<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.11905.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Method Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                    <subj-group>
                        <subject>Biomacromolecule-Ligand Interactions</subject>
                    </subj-group>
                    <subj-group>
                        <subject>Theory &amp; Simulation</subject>
                    </subj-group>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Virtual-screening workflow tutorials and prospective results from the Teach-Discover-Treat competition 2014 against malaria</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 3 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Riniker</surname>
                        <given-names>Sereina</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-1893-4031</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Landrum</surname>
                        <given-names>Gregory A.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-6279-4481</uri>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Montanari</surname>
                        <given-names>Floriane</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Villalba</surname>
                        <given-names>Santiago D.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Maier</surname>
                        <given-names>Julie</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <xref ref-type="aff" rid="a5">5</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Jansen</surname>
                        <given-names>Johanna M.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a6">6</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Walters</surname>
                        <given-names>W. Patrick</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a7">7</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Shelat</surname>
                        <given-names>Anang A.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="corresp" rid="c2">b</xref>
                    <xref ref-type="aff" rid="a5">5</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Laboratory of Physical Chemistry, ETH Z&#x00fc;rich, Z&#x00fc;rich, Switzerland</aff>
                <aff id="a2">
                    <label>2</label>T5 Informatics GmbH, Basel, Switzerland</aff>
                <aff id="a3">
                    <label>3</label>Pharmacoinformatics Research Group, Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria</aff>
                <aff id="a4">
                    <label>4</label>IMP - Research Institute of Molecular Pathology, Vienna Biocenter, Vienna, Austria</aff>
                <aff id="a5">
                    <label>5</label>Department of Chemical Biology &amp; Therapeutics, St. Jude Children's Research Hospital, Memphis, TN, USA</aff>
                <aff id="a6">
                    <label>6</label>Department of Global Discovery Chemistry, Novartis Institutes for BioMedical Research, Emeryville, CA, USA</aff>
                <aff id="a7">
                    <label>7</label>Relay Therapeutics, Cambridge, MA, USA</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:sriniker@ethz.ch">sriniker@ethz.ch</email>
                </corresp>
                <corresp id="c2">
                    <label>b</label>
                    <email xlink:href="mailto:Anang.Shelat@stjude.org">Anang.Shelat@stjude.org</email>
                </corresp>
                <fn fn-type="con">
                    <p>SR and GL created and applied Workflow 1. FM and SV created and applied Workflow 2. AS and JM performed the follow-up assay. JJ and PW are members of the TDT steering committee and organized the acquisition of chemical substances from vendors for testing in the follow-up assay.</p>
                </fn>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>The authors declare no competing financial interests.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>17</day>
                <month>7</month>
                <year>2017</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2017</year>
            </pub-date>
            <volume>6</volume>
            <elocation-id>1136</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>11</day>
                    <month>7</month>
                    <year>2017</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Riniker S et al.</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/6-1136/pdf"/>
            <abstract>
                <p>The first challenge in the 2014 competition launched by the Teach-Discover-Treat (TDT) initiative asked for the development of a tutorial for ligand-based virtual screening, based on data from a primary phenotypic high-throughput screen (HTS) against malaria. The resulting Workflows were applied to select compounds from a commercial database, and a subset of those were purchased and tested experimentally for anti-malaria activity. Here, we present the two most successful Workflows, both using machine-learning approaches, and report the results for the 114 compounds tested in the follow-up screen. Excluding the two known anti-malarials quinidine and amodiaquine and 31 compounds already present in the primary HTS, a high hit rate of 57% was found.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Malaria</kwd>
                <kwd>neglected diseases</kwd>
                <kwd>virtual screening</kwd>
                <kwd>machine learning</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100008273">
                    <funding-source>Novartis Institutes for BioMedical Research</funding-source>
                </award-group>
                <funding-statement>SR thanks the Novartis Institutes for BioMedical Research education office for a Presidential Postdoctoral Fellowship.</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>Teach-Discover-Treat (TDT) is an initiative that aims to provide high-quality tutorials on important tasks in computer-aided drug discovery, in order to impact education and drug discovery for neglected diseases
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup>. The TDT steering committee consists of computational chemists from both academia and industry. To encourage the creation of high-quality tutorials by the computational chemistry community, competitions are launched with a series of different challenges, and the results/tutorials are made available through the website of the initiative (
                <ext-link ext-link-type="uri" xlink:href="http://www.tdtproject.org">http://www.tdtproject.org</ext-link>). The competitions are open to everybody. After the first successful competition in 2012
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>
                </sup>, a second competition was launched in 2014, with four challenges. In this study, we focus on Challenge 1: ligand-based virtual screening (VS) against malaria. The goal was to build a predictive model for anti-malaria activity based on a phenotypic high-throughput screen (HTS), and to use that model subsequently to select the next set of compounds for screening. In a ligand-based VS, typically no structural information of the target is available, and thus the prediction of potentially active compounds is based on the principle that similar molecules exhibit similar activity
                <sup>
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup>. The challenge is therefore to find an appropriate molecular description for similarity, which can depend heavily on the compound selection and/or target
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-7">7</xref>
                </sup>. In recent years, machine-learning (ML) methods have emerged as an attractive tool to boost the predictive power of ligand-based VS approaches
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-12">12</xref>
                </sup>.</p>
            <p>Malaria is caused in humans by several species of the protozoan parasite 
                <italic toggle="yes">Plasmodium</italic>. The most lethal species is 
                <italic toggle="yes">Plasmodium falciparum</italic> (Pf), which causes organ failure and accumulates in the brain capillaries if left untreated. Malaria is still one of the most prevalent and deadly diseases in Africa, Asia and the Americas, with an estimated 198 million cases in 2013 leading to approximately 584,000 deaths, according to the 2014 World Malaria Report of the World Health Organization (WHO)
                <sup>
                    <xref ref-type="bibr" rid="ref-13">13</xref>
                </sup>. Recent advances in malaria research and drug discovery have been reviewed
                <sup>
                    <xref ref-type="bibr" rid="ref-14">14</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref-19">19</xref>
                </sup>. Anti-malaria drugs can be broadly classified into three groups: (i) compounds that interfere with heme detoxification, (ii) compounds that target folate metabolism, and (iii) compounds that inhibit mitochondrial electron transport. The current standard of care for uncomplicated malaria is artemisinin-based combination therapy. Artemisinins belong to the third group of anti-malaria drugs and rapidly kill all blood stages of the parasite; however, they are also cleared within a short time
                <sup>
                    <xref ref-type="bibr" rid="ref-20">20</xref>
                </sup>. Unfortunately, the emergence of resistant strains has become a major problem in recent years
                <sup>
                    <xref ref-type="bibr" rid="ref-21">21</xref>,
                    <xref ref-type="bibr" rid="ref-22">22</xref>
                </sup>, requiring the development of new and possibly orthogonal drugs. In the past, whole-cell phenotypic screening campaigns against Pf have been successful in identifying new lead compounds
                <sup>
                    <xref ref-type="bibr" rid="ref-23">23</xref>
                </sup>.</p>
            <p>Challenge 1 of the 2014 TDT competition involved three tasks: (i) analysis of the data from a single-concentration phenotypic HTS of 305,568 compounds, including hit-list triaging and selection of compounds for a follow-up screen with EC
                <sub>50</sub> measurement, (ii) building and validation of a predictive anti-malaria activity model, including a held-out test-set of 1056 compounds, and (iii) follow-up hit finding by applying the predictive model to rank-order a large dataset of commercially available compounds. The top 1000 molecules of this ranked list were considered further for experimental testing. For training, the challenge provided results for 305,568 compounds from the primary HTS, as well as EC
                <sub>50</sub> data from a follow-up confirmatory screen for a subset of the compounds.</p>
            <p>In this study, we present the results of two Workflows. Workflow 1 was the overall winner of the competition, and Workflow 2 showed the best performance on the held-out test set measured in the phenotypic Pf screen. Note that the two Workflows interpreted the challenge differently. In Workflow 1, only data from the primary HTS was used in the training of the predictive model in order to mimic the early phase of a drug discovery campaign. In Workflow 2, the EC
                <sub>50</sub> data from the confirmatory assay was taken into account in order to improve the labelling of the training set. Each Workflow provided a ranked list of the top 1000 molecules, from which a total of 114 compounds (80 from Workflow 1 and 38 from Workflow 2, with four in common) were selected based on vendor availability for screening in a Pf phenotypic assay. Excluding the two known anti-malarials quinidine and amodiaquine and the 31 compounds already present in the primary HTS, 46 of the remaining 81 compounds were found to be active in the follow-up assay, which corresponds to a hit rate of 57%.</p>
        </sec>
        <sec sec-type="materials | methods">
            <title>Methods</title>
            <p>The basis for the virtual screening workflows was a phenotypic high-throughput screen against Pf with 305,568 compounds, together with a confirmatory dose-response screen for 1524 compounds, which are reported in 
                <xref ref-type="bibr" rid="ref-23">23</xref>. The data is deposited in ChEMBL as part of the Neglected Tropical Diseases set (
                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chemblntd/">ChEMBL-NTD</ext-link>)
                <sup>
                    <xref ref-type="bibr" rid="ref-24">24</xref>
                </sup>. The data is also available on the TDT website (
                <ext-link ext-link-type="uri" xlink:href="http://www.tdtproject.org/challenge-1---malaria-hts.html">http://www.tdtproject.org/challenge-1---malaria-hts.html</ext-link>). In addition, an external held-out test set with 1056 molecules was provided for comparison of submissions
                <sup>
                    <xref ref-type="bibr" rid="ref-25">25</xref>
                </sup>. This dataset was generated in the laboratory of R. K. Guy in 2014, at the time of the TDT competition, following the procedure described in 
                <xref ref-type="bibr" rid="ref-23">23</xref>. Results for this held-out set are given in the 
                <xref ref-type="other" rid="SM1">Supplementary material</xref>.</p>
        </sec>
        <sec>
            <title>Workflow 1</title>
            <p>The tutorial was written as an IPython notebook, accompanied by a series of Python scripts so that the computationally demanding tasks can be executed separately. The tutorial is available on the TDT website (
                <ext-link ext-link-type="uri" xlink:href="http://www.tdtproject.org">http://www.tdtproject.org</ext-link>) and on GitHub (
                <ext-link ext-link-type="uri" xlink:href="https://github.com/sriniker/TDT-tutorial-2014">https://github.com/sriniker/TDT-tutorial-2014</ext-link>). The tutorial makes use of a number of open-source Python libraries: the cheminformatics toolkit RDKit version 2013.09 (
                <ext-link ext-link-type="uri" xlink:href="http://www.rdkit.org">http://www.rdkit.org</ext-link>), the machine-learning toolkit scikit-learn version 0.13 (
                <ext-link ext-link-type="uri" xlink:href="http://scikit-learn.org">http://scikit-learn.org</ext-link>), pandas for working with data tables, and libraries for scientific computing, numpy version 1.6.2 and scipy version 0.9.0. Figures are plotted using matplotlib version 1.1.0. The components of the Workflow are shown schematically in 
                <xref ref-type="fig" rid="f1">Figure 1</xref>.</p>
            <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                <label>Figure 1. </label>
                <caption>
                    <title>Schematic representation of Workflow 1.</title>
                </caption>
                <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_figure1.gif"/>
            </fig>
            <sec>
                <title>Data preprocessing</title>
                <p>The input for the workflow was the hit list from the phenotypic HTS, in which compounds were classified as &#x2018;active&#x2019;, &#x2018;inactive&#x2019;, or &#x2018;ambiguous&#x2019;
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup>. From the original 305,568 compounds tested in the screen, 1528 were found to be active and 293,608 inactive. The 10,432 molecules with an ambiguous outcome were discarded.</p>
            </sec>
            <sec>
                <title>Task 1: Selection of 500 molecules for follow-up testing</title>
                <p>To triage the hit list in the first task, property filters (
                    <xref ref-type="table" rid="T1">Table 1</xref>) based on previously described filters
                    <sup>
                        <xref ref-type="bibr" rid="ref-26">26</xref>,
                        <xref ref-type="bibr" rid="ref-27">27</xref>
                    </sup> were applied for 
                    <italic toggle="yes">in silico</italic> post-processing of the primary HTS data, which resulted in 1512 remaining active compounds.</p>
                <table-wrap id="T1" orientation="portrait" position="anchor">
                    <label>Table 1. </label>
                    <caption>
                        <title>Property filters for 
                            <italic toggle="yes">in silico</italic> post-processing of primary HTS data.</title>
                        <p>These filters are used in Workflow 1.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Property</th>
                                <th align="left" colspan="1" rowspan="1">Range</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Molecular weight</td>
                                <td align="left" colspan="1" rowspan="1">100&#x2013;700 g/mol</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Number of heavy atoms</td>
                                <td align="left" colspan="1" rowspan="1">5&#x2013;50</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Number of rotatable bonds</td>
                                <td align="left" colspan="1" rowspan="1">0&#x2013;12</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Hydrogen-bond donors</td>
                                <td align="left" colspan="1" rowspan="1">0&#x2013;5</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Hydrogen-bond acceptors</td>
                                <td align="left" colspan="1" rowspan="1">0&#x2013;10</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Hydrophobicity</td>
                                <td align="left" colspan="1" rowspan="1">-5 &lt; logP &lt; 7.5</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
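                <p>The ranges in Table 1 amount to a simple pass/fail check per molecule. The following is a minimal sketch in plain Python, not the code of the tutorial itself: it assumes that the six descriptors have already been computed (for example with RDKit), that the dictionary keys are illustrative names chosen here, and that the tabulated ranges are inclusive except for logP, which is listed with strict bounds.</p>
                <code language="python">
```python
# Inclusive ranges from Table 1. The keys are illustrative descriptor names
# chosen for this sketch; they do not come from the tutorial code.
INCLUSIVE_RANGES = {
    "mol_weight": (100.0, 700.0),   # g/mol
    "heavy_atoms": (5, 50),
    "rotatable_bonds": (0, 12),
    "hbond_donors": (0, 5),
    "hbond_acceptors": (0, 10),
}
# logP is tabulated with strict bounds, so both end points are excluded.
LOGP_BOUNDS = (-5.0, 7.5)


def passes_property_filters(descriptors):
    """Return True if all descriptors of one molecule fall inside Table 1."""
    for name, (low, high) in INCLUSIVE_RANGES.items():
        if not (high >= descriptors[name] >= low):
            return False
    logp_low, logp_high = LOGP_BOUNDS
    return logp_high > descriptors["logp"] > logp_low


# Usage with made-up descriptor values for a single hypothetical molecule.
example = {
    "mol_weight": 350.4,
    "heavy_atoms": 25,
    "rotatable_bonds": 4,
    "hbond_donors": 2,
    "hbond_acceptors": 5,
    "logp": 3.1,
}
keep = passes_property_filters(example)
```
                </code>
                <p>Keeping the ranges in one dictionary makes it easy to tighten or relax a single filter without touching the checking logic.</p>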
                <p>Next, the active molecules were checked for potentially problematic substructures using the PAINS filters described in 
                    <xref ref-type="bibr" rid="ref-28">28</xref>. A total of 1225 molecules passed these filters. From these, 500 molecules had to be picked for testing in a confirmatory assay. This selection had to balance two goals: a good sampling of the chemical space covered by the primary actives, and obtaining some structure-activity relationship (SAR) information from the confirmatory assay. The compounds were therefore clustered using the Butina algorithm
                    <sup>
                        <xref ref-type="bibr" rid="ref-29">29</xref>
                    </sup> based on Tanimoto similarity with a cutoff of 0.5. The Tanimoto similarity was calculated using RDKit fingerprints (a subgraph-based fingerprint similar to the Daylight fingerprint), with a maximum path length of five. In total, 304 clusters were found, with only 40 clusters having more than five members. The cluster centers provide a set of diverse seeds. To increase the chance of obtaining SAR information, molecules around the cluster centers were selected: starting with the largest cluster, the five molecules most similar to the cluster center (or 50% of the cluster members if the cluster contained fewer than five molecules) were picked.</p>
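                <p>The clustering and selection steps described above can be sketched as follows. This is a minimal plain-Python illustration of Butina-style clustering under stated assumptions, not the implementation used in the tutorial (which relies on RDKit): fingerprints are represented as sets of on-bit indices, all function names are hypothetical, the cluster center is counted among the selected molecules, and the 50% rule is rounded down with at least one pick per cluster.</p>
                <code language="python">
```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(bits_a.union(bits_b))
    if union == 0:
        return 0.0
    return len(bits_a.intersection(bits_b)) / union


def butina_clusters(fingerprints, sim_cutoff=0.5):
    """Cluster fingerprints; each cluster is a list of indices, center first."""
    n = len(fingerprints)
    neighbours = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i):
            if tanimoto(fingerprints[i], fingerprints[j]) >= sim_cutoff:
                neighbours[i].append(j)
                neighbours[j].append(i)
    # Molecules with the most neighbours are considered as centers first.
    order = sorted(range(n), key=lambda i: len(neighbours[i]), reverse=True)
    assigned = [False] * n
    clusters = []
    for idx in order:
        if assigned[idx]:
            continue
        members = [idx] + [j for j in neighbours[idx] if not assigned[j]]
        for j in members:
            assigned[j] = True
        clusters.append(members)
    return clusters


def pick_near_centers(fingerprints, clusters, per_cluster=5, max_picks=500):
    """Walk clusters from largest to smallest and pick molecules near each center."""
    picks = []
    for members in sorted(clusters, key=len, reverse=True):
        center = fingerprints[members[0]]
        ranked = sorted(
            members,
            key=lambda j: tanimoto(center, fingerprints[j]),
            reverse=True,
        )
        if len(members) >= per_cluster:
            quota = per_cluster
        else:
            # Assumption: 50% of a small cluster, rounded down, at least one.
            quota = max(1, len(members) // 2)
        picks.extend(ranked[:quota])
        if len(picks) >= max_picks:
            break
    return picks[:max_picks]


# Usage on six toy fingerprints (made-up bit indices).
toy_fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 4, 5},
           {10, 11, 12}, {10, 11, 13}, {20}]
toy_clusters = butina_clusters(toy_fps, sim_cutoff=0.5)
toy_picks = pick_near_centers(toy_fps, toy_clusters)
```
                </code>
                <p>The pairwise loop scales quadratically with the number of molecules, which is acceptable for roughly a thousand filtered actives but would need a more efficient neighbour search for a full screening deck.</p>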
            </sec>
            <sec>
                <title>Task 2: Prediction of anti-malarial activity for the held-out test set</title>
                <p>Three different machine-learning (ML) models together with three different molecular fingerprints were tested for the predictive model in task 2. The ML methods were random forest (RF)
                    <sup>
                        <xref ref-type="bibr" rid="ref-30">30</xref>
                    </sup>, Naive Bayes (NB) and logistic regression (LR), which showed a good performance in a previous benchmarking study
                    <sup>
                        <xref ref-type="bibr" rid="ref-13">13</xref>
                    </sup>. The RF models were built using 100 trees, a maximum depth of 100, and a minimum of one sample per leaf. For NB and LR, the default parameters in scikit-learn were used. The fingerprints were atom pairs (AP)
                    <sup>
                        <xref ref-type="bibr" rid="ref-31">31</xref>
                    </sup>, RDKit fingerprint with a maximum path length of five (RDK5) and Morgan fingerprint with a radius of 2 (Morgan2)
                    <sup>
                        <xref ref-type="bibr" rid="ref-32">32</xref>
                    </sup>, and are described in more detail in 
                    <xref ref-type="bibr" rid="ref-8">8</xref>. In the version of the Workflow submitted in the competition, the AP and RDK5 fingerprints were hashed to 2048 bits, and the Morgan2 fingerprints to 1024 bits. We later found that a fingerprint size of 4096 bits gave better performance owing to fewer bit collisions. To determine which ML method/fingerprint combinations performed best and should therefore be combined using heterogeneous classifier fusion
                    <sup>
                        <xref ref-type="bibr" rid="ref-13">13</xref>
                    </sup>, a retrospective evaluation was performed using the primary HTS data. Here, all data points from the primary screen were used (i.e. none of the property or substructure filters discussed above were applied), as some filters may be too strict and the ML methods are rather robust to noise. The data points were randomly split 50 times into a training set (90%) and a test set (10%). An ML model was built using the training set, and the molecules in the test set were ranked by their predicted probability of being active. From the ranked list, the receiver operating characteristic (ROC) curve was calculated and subsequently the area under the ROC curve (AUC) was determined. In addition, the enrichment factor at 5% (EF5%) was determined. A detailed discussion of the different evaluation methods is given in 
                    <xref ref-type="bibr" rid="ref-8">8</xref>. The results from the retrospective evaluation, averaged over the 50 repetitions, are listed in 
                    <xref ref-type="table" rid="T2">Table 2</xref>. Based on these results and the analysis of the diversity in the active molecules that were identified, a classifier fusion model was proposed based on RF with RDK5, RF with Morgan2 and LR with RDK5 (
                    <xref ref-type="table" rid="T2">Table 2</xref>). As a last step, a fusion model was trained using all data points of the primary HTS in order to obtain predictions for the held-out test set and for task 3.</p>
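                <p>The two evaluation measures used in the retrospective evaluation can be computed directly from predicted scores and activity labels. The sketch below is a plain-Python illustration under stated assumptions, not the evaluation code of the tutorial: the function names are hypothetical, labels are 1 for active and 0 for inactive, both classes are assumed to be present, tied scores count as half a win in the AUC, and the size of the top 5% is obtained by rounding.</p>
                <code language="python">
```python
def roc_auc(scores, labels):
    """AUC as the probability that a random active outranks a random inactive."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


def enrichment_factor(scores, labels, fraction=0.05):
    """Active rate in the top-ranked fraction divided by the overall active rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(fraction * len(ranked))))
    actives_top = sum(y for _, y in ranked[:n_top])
    actives_all = sum(labels)
    return (actives_top / n_top) / (actives_all / len(ranked))


# Usage on a toy test set: 100 molecules, the two actives are ranked first.
toy_scores = [1.0 - i / 100.0 for i in range(100)]
toy_labels = [1, 1] + [0] * 98
toy_auc = roc_auc(toy_scores, toy_labels)
toy_ef5 = enrichment_factor(toy_scores, toy_labels, fraction=0.05)
```
                </code>
                <p>When all actives fall within the top 5% of the ranked list, the enrichment factor reaches its upper bound of 1/0.05 = 20, which is the maximum quoted for Table 2. The pairwise AUC loop is quadratic in the number of molecules and is meant for illustration only.</p>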
                <table-wrap id="T2" orientation="portrait" position="anchor">
                    <label>Table 2. </label>
                    <caption>
                        <title>Evaluation results for anti-malaria activity prediction using a 90%-training and 10%-test set split for Workflow 1.</title>
                        <p>The random selection was repeated 50 times and the results were averaged over the repetitions. STD denotes the standard deviation over the repetitions. The maximum possible value of the enrichment factor at 5% (EF5%) is 20.0. Fingerprints with 4096 bits were used.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Method</th>
                                <th align="left" colspan="1" rowspan="1">AUC</th>
                                <th align="left" colspan="1" rowspan="1">STD
                                    <break/>AUC</th>
                                <th align="left" colspan="1" rowspan="1">EF5%</th>
                                <th align="left" colspan="1" rowspan="1">STD
                                    <break/>EF5%</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Similarity AP</td>
                                <td align="left" colspan="1" rowspan="1">0.88</td>
                                <td align="left" colspan="1" rowspan="1">0.02</td>
                                <td align="left" colspan="1" rowspan="1">13.94</td>
                                <td align="left" colspan="1" rowspan="1">0.69</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Similarity RDK5</td>
                                <td align="left" colspan="1" rowspan="1">0.88</td>
                                <td align="left" colspan="1" rowspan="1">0.02</td>
                                <td align="left" colspan="1" rowspan="1">13.75</td>
                                <td align="left" colspan="1" rowspan="1">0.74</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Similarity Morgan2</td>
                                <td align="left" colspan="1" rowspan="1">0.89</td>
                                <td align="left" colspan="1" rowspan="1">0.02</td>
                                <td align="left" colspan="1" rowspan="1">14.65</td>
                                <td align="left" colspan="1" rowspan="1">0.69</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">NB with AP</td>
                                <td align="left" colspan="1" rowspan="1">0.80</td>
                                <td align="left" colspan="1" rowspan="1">0.02</td>
                                <td align="left" colspan="1" rowspan="1">7.40</td>
                                <td align="left" colspan="1" rowspan="1">0.64</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">NB with RDK5</td>
                                <td align="left" colspan="1" rowspan="1">0.81</td>
                                <td align="left" colspan="1" rowspan="1">0.02</td>
                                <td align="left" colspan="1" rowspan="1">8.27</td>
                                <td align="left" colspan="1" rowspan="1">0.80</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">NB with Morgan2</td>
                                <td align="left" colspan="1" rowspan="1">0.85</td>
                                <td align="left" colspan="1" rowspan="1">0.02</td>
                                <td align="left" colspan="1" rowspan="1">10.42</td>
                                <td align="left" colspan="1" rowspan="1">0.98</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">LR with AP</td>
                                <td align="left" colspan="1" rowspan="1">0.88</td>
                                <td align="left" colspan="1" rowspan="1">0.02</td>
                                <td align="left" colspan="1" rowspan="1">12.53</td>
                                <td align="left" colspan="1" rowspan="1">0.92</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">LR with RDK5</td>
                                <td align="left" colspan="1" rowspan="1">0.91</td>
                                <td align="left" colspan="1" rowspan="1">0.02</td>
                                <td align="left" colspan="1" rowspan="1">14.99</td>
                                <td align="left" colspan="1" rowspan="1">0.80</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">LR with Morgan2</td>
                                <td align="left" colspan="1" rowspan="1">0.88</td>
                                <td align="left" colspan="1" rowspan="1">0.02</td>
                                <td align="left" colspan="1" rowspan="1">13.30</td>
                                <td align="left" colspan="1" rowspan="1">0.75</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">RF with AP</td>
                                <td align="left" colspan="1" rowspan="1">0.92</td>
                                <td align="left" colspan="1" rowspan="1">0.01</td>
                                <td align="left" colspan="1" rowspan="1">14.66</td>
                                <td align="left" colspan="1" rowspan="1">0.75</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">RF with RDK5</td>
                                <td align="left" colspan="1" rowspan="1">0.93</td>
                                <td align="left" colspan="1" rowspan="1">0.02</td>
                                <td align="left" colspan="1" rowspan="1">15.38</td>
                                <td align="left" colspan="1" rowspan="1">0.70</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">RF with Morgan2</td>
                                <td align="left" colspan="1" rowspan="1">0.93</td>
                                <td align="left" colspan="1" rowspan="1">0.01</td>
                                <td align="left" colspan="1" rowspan="1">15.28</td>
                                <td align="left" colspan="1" rowspan="1">0.70</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Fusion model</td>
                                <td align="left" colspan="1" rowspan="1">0.93</td>
                                <td align="left" colspan="1" rowspan="1">0.01</td>
                                <td align="left" colspan="1" rowspan="1">15.75</td>
                                <td align="left" colspan="1" rowspan="1">0.73</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
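                <p>The two evaluation metrics reported in Table 2 can be illustrated with a minimal, standard-library sketch. The function names and the toy data are assumptions made for illustration; the tutorial itself relies on the RDKit and scikit-learn, and its exact implementation may differ (for example in how ties or the size of the top fraction are handled).</p>
                <code language="python">
```python
def roc_auc(scores, labels):
    """AUC as the probability that a random active outranks a random inactive.

    Ties between an active and an inactive count as half a win.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


def enrichment_factor(scores, labels, fraction=0.05):
    """Hit rate in the top-ranked fraction divided by the overall hit rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(fraction * len(ranked)))
    hits_top = sum(y for _, y in ranked[:n_top])
    hits_all = sum(labels)
    return (hits_top / n_top) / (hits_all / len(ranked))


# Toy example (hypothetical data): 100 molecules, the 2 actives are ranked first.
toy_scores = [1.0 - i / 100 for i in range(100)]
toy_labels = [1, 1] + [0] * 98
toy_auc = roc_auc(toy_scores, toy_labels)
toy_ef5 = enrichment_factor(toy_scores, toy_labels, fraction=0.05)
```
</code>
                <p>In this toy case the ranking is perfect, so the AUC is 1 and the EF5% reaches its upper bound of 20, because the actives make up less than 5% of the set.</p>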
            </sec>
            <sec>
                <title>Task 3: Selection of 1000 new candidates from the eMolecules catalog</title>
                <p>In task 3, the goal was to select a list of 1000 compounds from the eMolecules (
                    <ext-link ext-link-type="uri" xlink:href="https://www.emolecules.com">https://www.emolecules.com</ext-link>) catalog, with nearly 5.5 million commercially available compounds. As a first step, the molecules were filtered using the property filters described in 
                    <xref ref-type="table" rid="T1">Table 1</xref>, except for the logP filter, which was deferred to reduce the computational cost. This resulted in approximately 4.4 million compounds. For these, molecular fingerprints (RDK5 and Morgan2) were generated with 4096 bits and the anti-malaria activity was predicted using the fusion model trained on the primary HTS in task 2. The 10,000 top-ranked compounds were taken for further selection. The logP filter (see 
                    <xref ref-type="table" rid="T1">Table 1</xref>) and PAINS substructure filters were applied at this point. Filtering resulted in 7955 compounds. To select 1000 molecules from these, the following procedure was applied:</p>
                <list list-type="bullet">
                    <list-item>
                        <p>The highest-ranked molecule is selected as first cluster center.</p>
                    </list-item>
                    <list-item>
                        <p>For the next-ranked molecule, the similarity to the first molecule is calculated: 
                            <list list-type="bullet">
                                <list-item>
                                    <label>&#x2610;</label>
                                    <p>If the similarity is below 0.5, the molecule is selected as a new cluster center.</p>
                                </list-item>
                                <list-item>
                                    <label>&#x2610;</label>
                                    <p>If the similarity is above 0.85 and the cluster does not contain 6 molecules yet (including the cluster center), the molecule is selected and added to the cluster.</p>
                                </list-item>
                                <list-item>
                                    <label>&#x2610;</label>
                                    <p>Else the molecule is discarded.</p>
                                </list-item>
                            </list>
                        </p>
                    </list-item>
                </list>
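                <p>The selection rules above can be sketched as follows. This is an illustrative reading, not the tutorial code: fingerprints are represented as plain sets of on-bit indices, and it is assumed here that each candidate is compared against all existing cluster centers and may join only its most similar center. All function and parameter names are hypothetical.</p>
                <code language="python">
```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a.intersection(b)) / union if union else 0.0


def select_diverse(ranked_fps, n_select=1000, new_center_below=0.5,
                   join_above=0.85, max_cluster_size=6):
    """Walk a best-first ranked list and return the indices of selected molecules."""
    clusters = []   # each entry: [center fingerprint, number of members]
    selected = []
    for idx, fp in enumerate(ranked_fps):
        if len(selected) >= n_select:
            break
        sims = [tanimoto(fp, center) for center, _ in clusters]
        # Dissimilar to every existing center: open a new cluster.
        if not clusters or new_center_below > max(sims):
            clusters.append([fp, 1])
            selected.append(idx)
            continue
        # Very similar to the closest center: join it if there is room.
        best = max(range(len(sims)), key=sims.__getitem__)
        if sims[best] > join_above and max_cluster_size > clusters[best][1]:
            clusters[best][1] += 1
            selected.append(idx)
        # Otherwise the molecule is discarded.
    return selected


# Hypothetical toy fingerprints, ordered from best to worst predicted score.
toy_fps = [{1, 2, 3, 4}, {1, 2, 3, 4}, {1, 2, 3}, {7, 8, 9}, {1, 2, 7, 8}, {1, 2, 3, 4}]
toy_selection = select_diverse(toy_fps, n_select=10, max_cluster_size=2)
```
</code>
                <p>With the cluster size capped at 2 for this toy input, the third molecule is discarded (similarity 0.75 lies between the two thresholds) and the last one is discarded because its cluster is already full.</p>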
                <p>The procedure was continued until 1000 compounds were selected. Unfortunately, a bug in the selection step of the original tutorial resulted in the 1000 compounds being randomly selected from the 10,000 top-ranked compounds. In addition, compounds already in the primary HTS used for training were not explicitly removed from the eMolecules catalog. A corrected version of the tutorial is provided on GitHub (
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/sriniker/TDT-tutorial-2014">https://github.com/sriniker/TDT-tutorial-2014</ext-link>).</p>
            </sec>
        </sec>
        <sec>
            <title>Workflow 2</title>
            <p>The tutorial is available on the TDT website (
                <ext-link ext-link-type="uri" xlink:href="http://www.tdtproject.org">http://www.tdtproject.org</ext-link>) and on GitHub (
                <ext-link ext-link-type="uri" xlink:href="https://github.com/sdvillal/tdt-malaria-followup">https://github.com/sdvillal/tdt-malaria-followup</ext-link>). RDKit version 2013_09_2 (
                <ext-link ext-link-type="uri" xlink:href="http://www.rdkit.org">http://www.rdkit.org</ext-link>) was used to read the SMILES strings, compute descriptors and fingerprints. Scikit-learn version 0.14 (
                <ext-link ext-link-type="uri" xlink:href="http://scikit-learn.org">http://scikit-learn.org</ext-link>) was used to build the models.</p>
            <sec>
                <title>Data preprocessing</title>
                <p>The input was again the original primary HTS data
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup> with 1528 active compounds, 293,608 inactive compounds and 10,432 molecules with an ambiguous outcome. In addition, pEC
                    <sub>50</sub> data from a dose-response confirmatory screen for 1524 compounds
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup> was taken into account. Compounds were relabeled using, when available, the confirmatory pEC
                    <sub>50</sub> data. Any compound with a pEC
                    <sub>50</sub> of at least 5 was considered positive for anti-malarial activity, independent of the original classification. As a result, 296 molecules were relabeled from positive to negative, 192 from ambiguous to negative, 52 from ambiguous to positive, and 4 from negative to positive. The final dataset contained 1288 compounds labeled as positives, 294,092 as negatives, and 10,188 as ambiguous. Ambiguous compounds were not considered for modeling.</p>
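                <p>The relabeling rule and the resulting class counts can be checked with a short sketch. The function name and label strings are assumptions for illustration; only the threshold and the counts are taken from the text above.</p>
                <code language="python">
```python
def relabel(primary_label, pec50=None, threshold=5.0):
    """Use the confirmatory pEC50 when available, otherwise keep the primary label."""
    if pec50 is None:
        return primary_label
    return "positive" if pec50 >= threshold else "negative"


# Class counts of the primary HTS and the reported label changes.
counts = {"positive": 1528, "negative": 293608, "ambiguous": 10432}
moves = [
    ("positive", "negative", 296),
    ("ambiguous", "negative", 192),
    ("ambiguous", "positive", 52),
    ("negative", "positive", 4),
]
for source, target, n in moves:
    counts[source] -= n
    counts[target] += n

print(counts)  # {'positive': 1288, 'negative': 294092, 'ambiguous': 10188}
```
</code>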
            </sec>
            <sec>
                <title>Descriptors and unfolded circular fingerprints</title>
                <p>To describe the chemical structures of the compounds, the 196 RDKit descriptors available by default were computed. This first set will be referred to as the &#x201c;RDKit descriptors&#x201d; set. Morgan fingerprints of both extended connectivity (ECFP) and functional class (FCFP) types
                    <sup>
                        <xref ref-type="bibr" rid="ref-32">32</xref>
                    </sup> were computed with a radius of up to 200 (meaning that all possible substructures are enumerated for each compound). Typically, circular fingerprints are hashed and folded to a fixed size, but this may lead to collisions, i.e. two different substructures are hashed to the same bit in the folded fingerprint. To avoid this problem, neither hashing nor folding was used in Workflow 2. All existing substructures were saved as SMILES strings and uniquely encoded by a large bitset containing all substructures occurring in the training set. The unfolded ECFP and FCFP fingerprints were concatenated into one vector.</p>
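                <p>The difference between folded and unfolded encodings can be illustrated with a small sketch. It uses CRC32 from the Python standard library as a stand-in hash, which is an assumption made for illustration only; the RDKit uses its own hashing scheme, and the substructure strings below are hypothetical.</p>
                <code language="python">
```python
import zlib


def folded_bits(substructures, n_bits):
    """Map each substructure string to a bit position by hashing modulo n_bits."""
    return {s: zlib.crc32(s.encode("utf-8")) % n_bits for s in substructures}


def count_collisions(substructures, n_bits):
    """Number of distinct substructures that end up sharing a bit with another one."""
    unique = set(substructures)
    occupied = set(folded_bits(unique, n_bits).values())
    return len(unique) - len(occupied)


def unfolded_index(training_substructures):
    """Give every distinct substructure its own column, so no collisions can occur."""
    return {s: i for i, s in enumerate(sorted(set(training_substructures)))}


# Hypothetical substructure SMILES collected from a training set.
toy_substructures = ["c1ccccc1", "CO", "CN", "C=O", "CCl", "CO"]
toy_index = unfolded_index(toy_substructures)
toy_collisions = count_collisions(toy_substructures, n_bits=2)
```
</code>
                <p>Folding five distinct substructures into two bits necessarily produces at least three collisions, whereas the unfolded index assigns five separate columns. The price of the unfolded encoding is a much longer and sparser feature vector.</p>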
            </sec>
            <sec>
                <title>Model building, validation and selection</title>
                <p>Random forests
                    <sup>
                        <xref ref-type="bibr" rid="ref-29">29</xref>
                    </sup> and extremely randomized trees
                    <sup>
                        <xref ref-type="bibr" rid="ref-33">33</xref>
                    </sup> with 10, 20, 50, 100, 500, 1000, 2000, 4000 and 6000 trees were built on the RDKit descriptors set, using multiple random seeds. Both methods use bagging to select instances for building each tree. As a result, for each individual tree, some instances were not used for training and are referred to as &#x201c;out-of-bag&#x201d;. These instances can be used for an unbiased estimate of the prediction error, instead of performing a computationally expensive cross-validation. Therefore, the out-of-bag scores were used as a measure of the quality of the models, and AUC, accuracy and enrichment at 5% were computed from these scores. The tree ensemble with 6000 trees gave the best results and was therefore selected for deployment (i.e. used for the computation of the final scores for the unlabeled datasets).</p>
                <p>After a first exploration of multiple parameters for logistic regression on the fingerprint set by cross-validation, the following parameters for building the models were chosen: a penalty of l1 or l2, a regularization parameter C of 1 or 5, a default tolerance of 0.0001, and the fingerprints were kept unfolded. Cross-validation was computed for 3, 5, 7 or 10 folds with five different seeds each. For each fold, the AUC and enrichment at 5% were computed. If a fold reached an AUC below 0.88, the rest of the cross-validation was skipped and the next model was built.</p>
                <p>The best models among the many logistic regression models for which all folds could be completed were those with a penalty of l1 and C of 1, with an average AUC over all folds above 0.92, as well as those with a penalty of l2 and C of 5, with an average AUC over all folds above 0.93. These particular models were selected for deployment (i.e. used for the computation of the final scores for the three tasks).</p>
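                <p>The early-termination rule used during cross-validation can be sketched as below. The callback <monospace>evaluate_fold</monospace> is a hypothetical placeholder for training a model on the remaining folds and returning the AUC on the held-out fold; it is not part of the tutorial code.</p>
                <code language="python">
```python
def cross_validate_with_early_stop(evaluate_fold, n_folds, min_auc=0.88):
    """Return the mean AUC over all folds, or None if any fold falls below min_auc."""
    aucs = []
    for fold in range(n_folds):
        auc = evaluate_fold(fold)  # hypothetical callback, see lead-in
        if min_auc > auc:
            return None            # skip the remaining folds of this configuration
        aucs.append(auc)
    return sum(aucs) / len(aucs)


# Toy usage with fixed per-fold AUC values (hypothetical numbers).
completed = cross_validate_with_early_stop(lambda fold: [0.93, 0.92, 0.94][fold], 3)
aborted = cross_validate_with_early_stop(lambda fold: [0.93, 0.85, 0.94][fold], 3)
```
</code>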
            </sec>
            <sec>
                <title>Task 1: Selection of 500 molecules for follow-up testing</title>
                <p>The first task involved the selection of 500 molecules from the primary HTS set with promising activity for follow-up confirmatory measurements. For this, the predictions of the deployment models were combined by plain averaging of the model scores. Note that this corresponds to model fitting scores, since the screening set is the training set used for building the deployment models. The 500 molecules with the highest average scores were selected for the follow-up testing.</p>
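                <p>Plain score averaging followed by a top-N selection can be sketched as follows. The data layout (a dictionary from compound identifier to a list of per-model scores) and the identifiers are assumptions for illustration; averaging in this way presumes that all models produce scores on a comparable scale.</p>
                <code language="python">
```python
def top_n_by_mean_score(model_scores, n=500):
    """Rank compounds by the plain average of their per-model scores.

    model_scores maps a compound identifier to a list of scores, one per model.
    """
    mean = {cid: sum(scores) / len(scores) for cid, scores in model_scores.items()}
    return sorted(mean, key=mean.get, reverse=True)[:n]


# Hypothetical scores from two deployment models for three compounds.
toy_scores = {"cmpd-a": [0.9, 0.7], "cmpd-b": [0.2, 0.4], "cmpd-c": [0.8, 0.95]}
toy_top2 = top_n_by_mean_score(toy_scores, n=2)
```
</code>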
            </sec>
            <sec>
                <title>Task 2: Prediction of anti-malarial activity for the held-out test set</title>
                <p>In 1992, Wolpert introduced the concept of stacked generalization
                    <sup>
                        <xref ref-type="bibr" rid="ref-34">34</xref>
                    </sup> to combine different models and boost the predictive power of the resulting ensemble. Here, feature-weighted linear stacking was used to combine our deployment models
                    <sup>
                        <xref ref-type="bibr" rid="ref-35">35</xref>
                    </sup>. For this, a linear regression was trained using the out-of-bag scores (for the ensemble of trees models) and cross-validation scores (for the logistic regression models) as independent variables, and anti-malarial activity as dependent variable. The resulting linear combination of models was applied to obtain the final score for the 1056 compounds of the held-out test set.</p>
            </sec>
            <sec>
                <title>Task 3: Selection of 1000 new candidates from the eMolecules catalog</title>
                <p>For the selection of new candidates, the same feature-weighted linear stacking as described for Task 2 was used. The resulting linear combination of individual model scores was applied to obtain the final score for the compounds of the eMolecules catalog (
                    <ext-link ext-link-type="uri" xlink:href="https://www.emolecules.com">https://www.emolecules.com</ext-link>). The 1000 top-scoring compounds were selected as new candidates for further anti-malaria screening. Compounds already present in the primary HTS and the confirmatory screen used for training were not explicitly removed from the eMolecules catalog.</p>
            </sec>
        </sec>
        <sec>
            <title>Final selection process</title>
            <p>From the two lists of 1000 new candidates, 114 molecules were selected for testing in a follow-up assay based on availability at vendors who agreed to be TDT sponsors. The set included two known anti-malarials, quinidine (proposed by Workflow 1) and amodiaquine (proposed by Workflow 2). Compounds that were already in the primary HTS and the confirmatory screen provided by the TDT challenge were not removed.</p>
        </sec>
        <sec>
            <title>Experimental procedures</title>
            <p>The potency of new candidates was determined as reported earlier
                <sup>
                    <xref ref-type="bibr" rid="ref-23">23</xref>
                </sup>. 
                <italic toggle="yes">Plasmodium falciparum</italic> strain 3D7 was acquired from the Malaria Research and Reference Reagent Resource Center (MR4, catalog #MRA-102). Briefly, asynchronous parasites were maintained in culture based on the method of Trager
                <sup>
                    <xref ref-type="bibr" rid="ref-36">36</xref>
                </sup>. Parasites were grown in the presence of fresh group O-positive erythrocytes (Key Biologics, LLC, Memphis, TN) in Petri dishes at a hematocrit of 4-6% in RPMI-based medium (RPMI 1640 supplemented with 0.5% AlbuMAX II, 25 mM HEPES, 25 mM NaHCO
                <sub>3</sub> (pH 7.3), 100 &#x00b5;g/mL hypoxanthine, and 5 &#x00b5;g/mL gentamicin). Cultures were incubated at 37&#x00b0;C in a gas mixture of 90% N
                <sub>2</sub>, 5% O
                <sub>2</sub>, 5% CO
                <sub>2</sub>. For IC
                <sub>50</sub> determinations, 20 &#x00b5;l of RPMI 1640 with 5 &#x00b5;g/ml gentamicin were dispensed per well in an assay plate (Corning 384-well microtiter plate, clear bottom, tissue culture treated, catalog no. 8807BC). Then, 60 nl of compound, previously serially diluted in a separate 384-well white polypropylene plate (Corning, catalog no. 8748BC), was dispensed to the assay plate by hydrodynamic pin transfer (FP1S50H, V&amp;P Scientific Pin Head), and 20 &#x00b5;l of a synchronized culture suspension (1% rings, 4% hematocrit) was added per well, giving a final hematocrit and parasitemia of 2% and 1%, respectively. Assay plates were incubated for 72 h, and the parasitemia was determined by a method previously described
                <sup>
                    <xref ref-type="bibr" rid="ref-37">37</xref>
                </sup>. To each well, 10 &#x00b5;l of the following solution in PBS was added: 10X SYBR Green I, 0.5% v/v Triton, 0.5 mg/ml saponin. Assay plates were shaken for 1 min, incubated in the dark for 90 min, then read with the Envision spectrophotometer at Ex/Em of 485 nm/535 nm.</p>
            <p>EC
                <sub>50</sub> values were calculated using a four-parameter logistic equation as described previously
                <sup>
                    <xref ref-type="bibr" rid="ref-23">23</xref>
                </sup>. Compounds were arrayed in ten concentrations, ranging from approximately 10 &#x00b5;M to 5 nM, and the drc package for R was used to fit the observed response to the four-parameter Hill equation
                <sup>
                    <xref ref-type="bibr" rid="ref-38">38</xref>
                </sup>. The purity of all compounds was determined by UPLC (UV and ELSD purity average) and results from any compound with a purity below 95% were not reported.</p>
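            <p>For orientation, a four-parameter logistic (Hill) curve and the relation between EC<sub>50</sub> and pEC<sub>50</sub> can be written as in the sketch below. The parameter names are assumptions for illustration, the actual fits were performed with the drc package in R, and the geometric spacing of the dilution series is assumed, since the text only states the approximate range.</p>
            <code language="python">
```python
import math


def four_parameter_logistic(conc, lower, upper, ec50, hill):
    """Response at concentration conc; lower and upper are the two asymptotes."""
    return lower + (upper - lower) / (1.0 + (conc / ec50) ** hill)


def pec50(ec50_molar):
    """Negative decadic logarithm of the EC50 given in mol/l."""
    return -math.log10(ec50_molar)


def dilution_series(top, bottom, n_points):
    """Geometric series from top to bottom concentration (assumed spacing)."""
    factor = (top / bottom) ** (1.0 / (n_points - 1))
    return [top / factor ** i for i in range(n_points)]


# At conc == ec50 the response lies halfway between the two asymptotes.
midpoint = four_parameter_logistic(0.44, lower=0.0, upper=100.0, ec50=0.44, hill=1.5)

# Ten concentrations from 10 uM down to 5 nM, expressed in nM.
series_nm = dilution_series(10000.0, 5.0, 10)

# An EC50 of 10 uM corresponds to a pEC50 of 5.
pec50_10um = pec50(10e-6)
```
</code>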
            <sec>
                <title>Analysis</title>
                <p>Morgan2 fingerprints
                    <sup>
                        <xref ref-type="bibr" rid="ref-32">32</xref>
                    </sup> and Tanimoto similarities were calculated using the RDKit. The scaffolds in the set of newly tested compounds were determined using the Bemis-Murcko algorithm
                    <sup>
                        <xref ref-type="bibr" rid="ref-39">39</xref>
                    </sup>.</p>
            </sec>
        </sec>
        <sec sec-type="results">
            <title>Results</title>
            <sec>
                <title>Held-out test set</title>
                <p>The external held-out test set of the TDT challenge consisted of 101 actives and 955 inactives. The performances of the ML models of Workflow 1 and Workflow 2 on the held-out test set (1056 molecules) are given in 
                    <xref ref-type="table" rid="T3">Table 3</xref>. For Workflow 1, the results using fingerprints with 1024/2048 bits or with 4096 bits are reported. Note that the maximum possible EF5% for the held-out test set is 10.5 (as the fraction 
                    <italic toggle="yes">&#x03c7;</italic> = 0.05 is smaller than the ratio of actives to inactives
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>), whereas it is 20.0 for the primary HTS dataset. Workflow 2 gave the best performance for the held-out test set among all five submissions to this TDT challenge. For Workflow 1, the version using 1024/2048 bits was the one submitted to the TDT challenge. Later, it was found that a substantial number of collisions due to hashing occurred in the short fingerprints, which affected the performance. Using longer fingerprints (i.e. 4096 bits), the performance could be improved and was found to be similar to that of Workflow 2. This highlights the robustness to noise of the ML methods used, since in Workflow 1 the false positives in the primary data were included. In Workflow 2, these false positives were corrected using the information from the confirmatory screen.</p>
                <table-wrap id="T3" orientation="portrait" position="anchor">
                    <label>Table 3. </label>
                    <caption>
                        <title>Evaluation results for anti-malaria activity on the held-out test set (1056 molecules).</title>
                        <p>Predictions were obtained using the fusion models of Workflow 1 and the linear combination of model scores of Workflow 2. The maximum possible EF5% value is 10.5.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Method</th>
                                <th colspan="1" rowspan="1">AUC</th>
                                <th colspan="1" rowspan="1">EF5%</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td colspan="1" rowspan="1">Workflow 1 - Fusion model
                                    <break/>(1024/2048 bits)</td>
                                <td colspan="1" rowspan="1">0.74</td>
                                <td colspan="1" rowspan="1">2.76</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Workflow 1 - Fusion model
                                    <break/>(4096 bits)</td>
                                <td colspan="1" rowspan="1">0.75</td>
                                <td colspan="1" rowspan="1">4.75</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Workflow 2</td>
                                <td colspan="1" rowspan="1">0.79</td>
                                <td colspan="1" rowspan="1">4.34</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
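                <p>The two upper bounds for EF5% quoted above follow from the fraction of actives in each dataset, as the short calculation below illustrates. The function name is an assumption for illustration; the total for the primary HTS is taken as the sum of the 1528 actives and 293,608 inactives stated in the data preprocessing section.</p>
                <code language="python">
```python
def max_enrichment_factor(n_actives, n_total, fraction=0.05):
    """Upper bound of the enrichment factor at the given fraction.

    If actives are rarer than the fraction, the bound is 1/fraction;
    otherwise the top fraction can at best consist of actives only.
    """
    active_ratio = n_actives / n_total
    return 1.0 / max(fraction, active_ratio)


max_ef_heldout = round(max_enrichment_factor(101, 1056), 1)
max_ef_primary = round(max_enrichment_factor(1528, 1528 + 293608), 1)
print(max_ef_heldout, max_ef_primary)  # 10.5 20.0
```
</code>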
                <p>For both Workflows, the AUC and the EF5% values were found to be substantially lower for the held-out test set than for the 10%-test split in 
                    <xref ref-type="table" rid="T2">Table 2</xref>. Although the size distribution and flexibility of the compounds in the different sets were similar (
                    <xref ref-type="table" rid="T4">Table 4</xref>) and the similarities within and across the datasets were generally low (left panel in 
                    <xref ref-type="fig" rid="f2">Figure 2</xref>), there are slightly more highly similar compounds among the actives of the primary HTS (as in the original classification
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup>) than between those and the actives in the held-out test set (right panel in 
                    <xref ref-type="fig" rid="f2">Figure 2</xref>). In addition, there were some highly similar compounds between the actives in the primary HTS and the inactives in the held-out test set.</p>
                <table-wrap id="T4" orientation="portrait" position="anchor">
                    <label>Table 4. </label>
                    <caption>
                        <title>Properties of the molecules in the primary HTS and in the held-out test set.</title>
                        <p>The compounds in the primary HTS were split into 1528 actives and 293,606 inactives. The compounds in the held-out test set were split into 101 actives and 955 inactives. For the primary screen, the original classification into actives and inactives was used
                            <sup>
                                <xref ref-type="bibr" rid="ref-23">23</xref>
                            </sup>. For the held-out test set, a cutoff of 10 &#x03bc;M was employed.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Dataset</th>
                                <th align="left" colspan="1" rowspan="1">Median
                                    <break/>molecular
                                    <break/>weight
                                    <break/>[g/mol]</th>
                                <th colspan="1" rowspan="1">Median
                                    <break/>number of
                                    <break/>rotatable
                                    <break/>bonds</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td colspan="1" rowspan="1">Actives in primary HTS</td>
                                <td colspan="1" rowspan="1">394.0 </td>
                                <td colspan="1" rowspan="1">5.0</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Inactives in primary HTS</td>
                                <td colspan="1" rowspan="1">373.1</td>
                                <td colspan="1" rowspan="1">5.0</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Actives in held-out test set</td>
                                <td colspan="1" rowspan="1">387.2</td>
                                <td colspan="1" rowspan="1">5.0</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Inactives in held-out test set</td>
                                <td colspan="1" rowspan="1">374.1</td>
                                <td colspan="1" rowspan="1">5.0</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>Similarity distributions for the molecules in the primary HTS and the held-out test set.</title>
                        <p>Normalized Tanimoto similarity distribution using a Morgan2 fingerprint
                            <sup>
                                <xref ref-type="bibr" rid="ref-32">32</xref>
                            </sup> within the actives in the primary HTS (blue), and between them and the actives (green) and inactives (red) of the held-out test set. The full distributions (left) and the slice between 0.8 and 1.0 similarity (right) are shown. For the primary screen, the original classification into actives and inactives was used
                            <sup>
                                <xref ref-type="bibr" rid="ref-23">23</xref>
                            </sup>. For the held-out test set, a cutoff of 10 &#x03bc;M was employed.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_figure2.gif"/>
                </fig>
            </sec>
            <sec>
                <title>Prospective phenotypic screen</title>
                <p>From the combined set of 2000 candidates predicted by Workflow 1 and Workflow 2, 114 were tested in a follow-up assay (80 from Workflow 1 and 38 from Workflow 2, four compounds were predicted by both Workflows). The identifiers, SMILES, EC
                    <sub>50</sub> values and raw data for all 114 compounds are given in the 
                    <xref ref-type="other" rid="SM1">Supplementary material</xref>. Of these, two were known anti-malarials (quinidine and amodiaquine) selected as positive controls. In addition, 31 compounds (six from Workflow 1 and 28 from Workflow 2, three were in common) were already present in the primary HTS and confirmatory screen provided by the TDT challenge, as such molecules were not explicitly removed from the eMolecules catalog before the virtual screen (
                    <xref ref-type="other" rid="ST1">Supplementary Table S1</xref>). One of these compounds, 
                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembldb/index.php/compound/inspect/CHEMBL592096">SJ000154494</ext-link> (
                    <xref ref-type="fig" rid="f3">Figure 3</xref>, EC
                    <sub>50</sub> = 0.44 &#x00b5;M as measured in this study) was found inactive in the previous primary screen and confirmatory screen
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup>. This earlier result was likely a false negative in the latter screen, because the dose-response testing that immediately followed the primary screen used compounds from stock solutions of varying age, whereas the current experiment was performed on fresh powder.</p>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>Compound 
                            <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL592096">SJ000154494</ext-link> (EC
                            <sub>50</sub> = 0.44 &#x00b5;M as measured in this study).</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_figure3.gif"/>
                </fig>
                <p>The results for the remaining 81 new compounds and the two known anti-malarials are listed in 
                    <xref ref-type="table" rid="T5">Table 5</xref>. A list of all 114 compounds, including SMILES, is provided as a separate file in the 
                    <xref ref-type="other" rid="ST1">Supplementary material</xref>. Partially active or single-point active molecules were counted as inactives. As the list of 1000 compounds in Workflow 1 was randomly selected from the top 10,000 ranked compounds in the eMolecules database, the ranks in the latter list are also reported in 
                    <xref ref-type="table" rid="T5">Table 5</xref>. Of the nine molecules proposed by Workflow 2, only two were not in the top-10,000 list from Workflow 1, indicating that the two approaches generally pick up similar features but do not score them in the same manner. Of the 81 new compounds, 46 were found to be active, resulting in an overall hit rate of 57%. In more detail, Workflow 1 gave a hit rate of 52% and Workflow 2 a hit rate of 100%. Because of the small number of compounds tested, we cannot judge whether this difference in hit rate is significant. As the TDT initiative relies on contributions of compounds, a more systematic assessment is outside the scope of this effort. Interestingly, the most active compounds were ranked rather low in the top-1000 list of Workflow 2 and the top-10,000 list of Workflow 1 compared to the other molecules tested, which again emphasizes that in ligand-based VS it is important to pick the compounds for follow-up testing relatively broadly from the top fraction.</p>
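                <p>The compound counts and the overall hit rate quoted above can be retraced with a few lines of arithmetic. The following Python sketch recomputes them from the numbers given in the text; the variable and function names are illustrative. The Wilson score interval at the end is not part of the original analysis and is included only as one possible way to show how wide the uncertainty is for nine tested compounds.</p>
                <preformat preformat-type="code">
```python
import math

# Counts as stated in the text (variable names are illustrative).
tested_total = 80 + 38 - 4       # Workflow 1 + Workflow 2 - predicted by both
already_screened = 6 + 28 - 3    # present in the primary HTS or confirmatory screen
positive_controls = 2            # quinidine and amodiaquine
new_compounds = tested_total - already_screened - positive_controls
new_actives = 46


def hit_rate(n_active, n_tested):
    """Hit rate as a percentage."""
    return 100.0 * n_active / n_tested


def wilson_interval(n_active, n_tested, z=1.96):
    """Approximate 95% Wilson score interval (an assumed choice of method)."""
    p = n_active / n_tested
    denom = 1.0 + z * z / n_tested
    centre = (p + z * z / (2 * n_tested)) / denom
    spread = p * (1 - p) / n_tested + z * z / (4 * n_tested ** 2)
    half = z * math.sqrt(spread) / denom
    return centre - half, centre + half


print(f"compounds tested: {tested_total}, new compounds: {new_compounds}")
print(f"overall hit rate: {hit_rate(new_actives, new_compounds):.0f}%")

# Nine new molecules from Workflow 2, all of them active.
low, high = wilson_interval(9, 9)
print(f"Workflow 2 interval: {100 * low:.0f}% to {100 * high:.0f}%")
```
                </preformat>
                <p>The first two printed lines reproduce the 114 tested compounds, the 81 new compounds and the overall hit rate of 57%.</p>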
                <table-wrap id="T5" orientation="portrait" position="anchor">
                    <label>Table 5. </label>
                    <caption>
                        <title>Results from the follow-up assay for 83 compounds.</title>
                        <p>The columns are as follows: EC
                            <sub>50</sub> values, the final scores (active or inactive), and the ranks in Workflows 1 and 2. Partially active or single-point active compounds were considered inactives (marked by italic font). ChEMBL-NTD datasets: Novartis-GNF Malaria Box (N)
                            <sup>
                                <xref ref-type="bibr" rid="ref-40">40</xref>
                            </sup>, St. Jude Children's Research Hospital Dataset (J)
                            <sup>
                                <xref ref-type="bibr" rid="ref-24">24</xref>
                            </sup>, GSK TCAMS (G)
                            <sup>
                                <xref ref-type="bibr" rid="ref-41">41</xref>
                            </sup>, DNDi HAT set (D). Compounds marked with (P) were tested in PubChem assays.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Identifier</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">EC
                                    <sub>50</sub>
                                    <break/>[&#x03bc;M]</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Score</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Rank
                                    <break/>(1000)
                                    <break/>Workflow 1</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Rank (top
                                    <break/>10,000)
                                    <break/>Workflow 1</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Rank (top
                                    <break/>1000)
                                    <break/>Workflow 2</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Known Datasets</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL682">SJ000110703</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.025</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="right" colspan="1" rowspan="1" valign="top">3907</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">853</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Amodiaquine</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1294">SJ000285572</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.060</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">867</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6589</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Quinidine</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866784</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.099</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="right" colspan="1" rowspan="1" valign="top">4544</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">931</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866752</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.14</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="right" colspan="1" rowspan="1" valign="top">3108</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">826</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866753</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.18</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">725</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5240</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866807</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.20</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">613</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4337</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000361770</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.28</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="right" colspan="1" rowspan="1" valign="top">3394</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">952</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866781</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.29</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">476</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3299</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL580340">SJ000866764</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.39</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="right" colspan="1" rowspan="1" valign="top">2174</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">720</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">N, P (active)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866760</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.72</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">300</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1739</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1615645">SJ000866797</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.76</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="right" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">868</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">P (anti-malaria:
                                    <break/>AID504832,
                                    <break/>AID504834) (active)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866778</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.77</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="right" colspan="1" rowspan="1" valign="top">-</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">984</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL548572">SJ000866810</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.84</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="right" colspan="1" rowspan="1" valign="top">974</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">100</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">N, J, G (active)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL3304440">SJ000866811</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.92</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">829</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6197</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">P (not anti-malaria)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1619870">SJ000866815</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">0.98</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="right" colspan="1" rowspan="1" valign="top">9752</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">569</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">P (anti-malaria:
                                    <break/>AID504382) (active)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866767</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">597</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4262</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866780</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">171</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">688</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866773</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">697</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5138</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866792</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.5</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">950</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">7129</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL581866">SJ000866800</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1.9</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">714</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5205</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">N (active)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000377329</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">807</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6068</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866786</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2.3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">784</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5832</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000364456</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2.4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">808</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6073</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL2095075">SJ000866779</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3.2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">456</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3069</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">D (inactive)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866794</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3.2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">559</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3935</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866757</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3.3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">901</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6813</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866813</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4.1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">509</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3613</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866809</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4.3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">403</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2603</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000377299</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4.4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">844</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6318</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866777</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4.6</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">584</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4159</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866750</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5.4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">798</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6016</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866789</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">7.8</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">830</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6198</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866790</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">8.2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">409</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2624</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866755</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">8.4</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">789</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5923</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL2442249">SJ000399327</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">9.2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">617</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4383</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">P (anti-malaria:
                                    <break/>AID504832,
                                    <break/>AID504834) (active)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866806</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">9.2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">785</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5850</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866759</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">9.3</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">927</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">7004</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866747</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">9.5</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">610</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4303</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL2094275">SJ000866799</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">9.5</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">546</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3852</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">989</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">D (active)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866766</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">9.7</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">510</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3614</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866749</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">10.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">561</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3942</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866793</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">10.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">590</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4191</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866768</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">12.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">444</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2939</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866762</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">12.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">474</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3282</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866788</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">14.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">238</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1166</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866798</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">14.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">852</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6416</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000420481</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">17.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">484</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3378</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866776</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">18.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">Active</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">35</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">61</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866769</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3.7</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">768</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5649</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866796</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4.6</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">521</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3684</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866804</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6.1</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">264</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1366</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866765</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">18</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">31</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866771</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">600</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4272</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866783</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">7.2</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">747</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5414</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">P (anti-malaria:
                                    <break/>AID504832,
                                    <break/>AID504834) (inactive)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866802</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">7.9</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">401</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2598</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866808</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">11.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">570</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4051</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866748</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">11.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">850</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6407</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866785</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">19.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">711</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5202</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866751</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">197</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">880</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL3209657">SJ000389261</ext-link>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">998</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">7634</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">P (anti-malaria:
                                    <break/>AID504832,
                                    <break/>AID504834) (inactive)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866758</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">8.8</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">859</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6525</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866746</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">15.0</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">816</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6110</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866782</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">14</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">23</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866803</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">360</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2269</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866805</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">379</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2468</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000388303</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">411</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2630</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866770</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">437</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2879</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866801</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">504</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3588</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866772</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">506</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3600</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866775</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">563</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3948</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866763</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">618</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4385</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866761</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">620</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">4405</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866745</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">704</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5167</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866791</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">741</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5376</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866812</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">779</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5792</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866795</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">Inactive</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">952</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">7149</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866756</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">191</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">852</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866754</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">250</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">1257</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866814</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">404</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">2606</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000394036</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">457</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3073</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866774</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">547</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">3858</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000866787</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">695</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">5124</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">SJ000391199</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td align="left" colspan="1" rowspan="1" valign="top">
                                    <italic toggle="yes">Inactive</italic>
                                </td>
                                <td align="right" colspan="1" rowspan="1" valign="top">828</td>
                                <td align="right" colspan="1" rowspan="1" valign="top">6194</td>
                                <td colspan="1" rowspan="1" valign="top"/>
                                <td colspan="1" rowspan="1" valign="top"/>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <p>For Workflow 1, six of the 73 new compounds had been tested previously in anti-malaria activity assays found in ChEMBL-NTD (
                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chemblntd/">https://www.ebi.ac.uk/chemblntd/</ext-link>) and PubChem (
                    <ext-link ext-link-type="uri" xlink:href="https://pubchem.ncbi.nlm.nih.gov">https://pubchem.ncbi.nlm.nih.gov</ext-link>), and three of them were found to be active. Three main scaffolds accounted for 25 of the 73 compounds: the thiazolidin-4-one, 8-hydroxyquinoline, and aminopyrimidine types (
                    <xref ref-type="table" rid="T6">Table 6</xref>). The compounds with the thiazolidin-4-one-type scaffold formed the largest group. The scaffold can be seen as a variation of compound 
                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembldb/index.php/compound/inspect/CHEMBL592096">SJ000154494</ext-link> (
                    <xref ref-type="fig" rid="f3">Figure 3</xref>), but the compounds in this group were mostly inactive. In addition, the scaffold may be a PAINS substructure because of its similarity to rhodanine, although it is not currently part of the PAINS filters
                    <sup>
                        <xref ref-type="bibr" rid="ref-28">28</xref>
                    </sup>. The 8-hydroxyquinoline scaffold is a phenolic Mannich base, which is a PAINS substructure. The most interesting scaffold is the aminopyrimidine type, which carries a second 
                    <italic toggle="yes">N</italic>-alkyl substituent instead of a known 
                    <italic toggle="yes">N</italic>-aryl substituent. The most active compound of this series, SJ000866807, exhibits good ligand efficiency, with an EC
                    <sub>50</sub> of 0.2 &#x03bc;M and a molecular weight of only 266 g/mol. Only one compound from this series (SJ000866811) was listed in PubChem, but in an assay for anti-cancer activity (AID 743276) rather than anti-malaria activity. However, similar compounds were previously reported in the Novartis-GNF Malaria Box
                    <sup>
                        <xref ref-type="bibr" rid="ref-40">40</xref>
                    </sup> (
                    <xref ref-type="fig" rid="f4">Figure 4</xref>).</p>
                <table-wrap id="T6" orientation="portrait" position="anchor">
                    <label>Table 6. </label>
                    <caption>
                        <title>The three main scaffolds present in the 73 compounds predicted by Workflow 1.</title>
                        <p>ChEMBL-NTD datasets: Novartis-GNF Malaria Box (N)
                            <sup>
                                <xref ref-type="bibr" rid="ref-40">40</xref>
                            </sup>, St. Jude Children's Research Hospital Dataset (J)
                            <sup>
                                <xref ref-type="bibr" rid="ref-24">24</xref>
                            </sup>, GSK TCAMS (G)
                            <sup>
                                <xref ref-type="bibr" rid="ref-41">41</xref>
                            </sup>, DNDi HAT set (D). Compounds marked with (P) were tested in PubChem assays.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <tbody>
                            <tr>
                                <td align="center" colspan="6" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T1.gif"/>
</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>Identifier</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>R1</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>R2</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>R3</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>EC
                                        <sub>50</sub> [&#x03bc;M]</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>Known Datasets</bold>
</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000388303</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T2.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">-</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000391199</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T3.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">-</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000389261</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T4.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">6.0</td>
                                <td align="left" colspan="1" rowspan="1">P (anti-malaria:
                                    <break/>AID504832,
                                    <break/>AID504834)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000394036</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T5.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">-</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866774</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T6.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">-</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866791</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T7.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">-</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866759</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T8.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">9.3</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866776</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T9.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">18.0</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866756</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T10.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T11.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">-</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866814</td>
                                <td align="left" colspan="1" rowspan="1">Phenyl-</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T12.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">Cyano-</td>
                                <td align="left" colspan="1" rowspan="1">-</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866809</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T13.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T14.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">Cyano-</td>
                                <td align="left" colspan="1" rowspan="1">4.3</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866805</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T15.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T16.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">Cyano-</td>
                                <td align="left" colspan="1" rowspan="1">-</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866804</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T17.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T18.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">Cyano-</td>
                                <td align="left" colspan="1" rowspan="1">6.1</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866802</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T19.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T20.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">Cyano-</td>
                                <td align="left" colspan="1" rowspan="1">7.9</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866801</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T21.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T22.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">Cyano-</td>
                                <td align="left" colspan="1" rowspan="1">-</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="center" colspan="6" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T23.gif"/>
</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>Identifier</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>R1</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>R2</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>R3</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>EC
                                        <sub>50</sub> [&#x03bc;M]</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>Known Datasets</bold>
</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866799</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T24.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T25.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">9.5</td>
                                <td align="left" colspan="1" rowspan="1">D</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866771</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T26.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T27.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">-</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866779</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T28.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T29.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">3.2</td>
                                <td align="left" colspan="1" rowspan="1">D</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866777</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T30.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T31.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">4.6</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866800</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T32.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T33.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">H</td>
                                <td align="left" colspan="1" rowspan="1">1.9</td>
                                <td align="left" colspan="1" rowspan="1">N</td>
                            </tr>
                            <tr>
                                <td align="center" colspan="6" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T34.gif"/>
</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>Identifier</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>R1</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>R2</bold>
</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>EC
                                        <sub>50</sub> [&#x03bc;M]</bold>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <bold>Known Datasets</bold>
</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866807</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T35.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T36.gif"/>
</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1">0.20</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866760</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T37.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T38.gif"/>
</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1">0.72</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000866811</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T39.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T40.gif"/>
</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1">0.92</td>
                                <td align="left" colspan="1" rowspan="1">P (not anti-malaria)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000377329</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T41.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T42.gif"/>
</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1">2.0</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">SJ000377299</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T43.gif"/>
</td>
                                <td align="left" colspan="1" rowspan="1">

                                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_T44.gif"/>
</td>
                                <td colspan="1" rowspan="1"/>
                                <td align="left" colspan="1" rowspan="1">4.4</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                    <label>Figure 4. </label>
                    <caption>
                        <title>Compounds from the Novartis-GNF Malaria Box
                            <sup>
                                <xref ref-type="bibr" rid="ref-40">40</xref>
                            </sup>, with an aminopyrimidine-type scaffold.</title>
                        <p>These compounds are similar to the group of compounds predicted by Workflow 1 with the same scaffold (
                            <xref ref-type="table" rid="T6">Table 6</xref>).</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_figure4.gif"/>
                </fig>
                <p>The nine new compounds proposed by Workflow 2 are shown in 
                    <xref ref-type="fig" rid="f5">Figure 5</xref>. Five of them had previously tested active in one of the ChEMBL-NTD assays or in PubChem assays for anti-malaria activity. Two compounds (SJ000866810 and SJ000866799) have the same 8-hydroxyquinoline-type scaffold as compounds proposed by Workflow 1, and one compound (SJ000866764) has a similar aminopyrimidine-type scaffold. Among the most active compounds predicted by both Workflows was a series of molecules with a benzothiazole scaffold (
                    <xref ref-type="fig" rid="f6">Figure 6</xref>). Compounds with a similar scaffold were tested previously in PubChem assays for anti-malaria activity or are part of the ChEMBL-NTD datasets. Compound 
                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/compound/inspect/CHEMBL1197797">SJ000040830</ext-link> also showed high anti-leishmanial activity
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup>. This scaffold may nevertheless raise PAINS issues that the current PAINS filters do not cover, because the extended &#x03c0;-system may act as a Michael-like acceptor.</p>
                <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                    <label>Figure 5. </label>
                    <caption>
                        <title>Nine compounds proposed by Workflow 2.</title>
                        <p>The molecules are ordered by decreasing activity.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_figure5.gif"/>
                </fig>
                <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                    <label>Figure 6. </label>
                    <caption>
                        <title>Compounds with a benzothiazole scaffold.</title>
                        <p>(Top): Compounds predicted by Workflow 1 and Workflow 2. (Bottom): Active compounds from PubChem, the Novartis-GNF Malaria Box
                            <sup>
                                <xref ref-type="bibr" rid="ref-40">40</xref>
                            </sup> and St. Jude Children's Research Hospital
                            <sup>
                                <xref ref-type="bibr" rid="ref-24">24</xref>
                            </sup>.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12868/0acd2a8e-684b-4da7-9596-e4bd089cef3b_figure6.gif"/>
                </fig>
            </sec>
            <sec sec-type="conclusions">
                <title>Conclusions</title>
                <p>Selecting new, potentially active compounds for testing by ligand-based VS trained on the results of a primary HTS is a common task in drug discovery. Here, we presented two detailed Workflows that use open-source tools for educational purposes, and reported their application to the identification of anti-malarial compounds as part of the 2014 TDT challenge. Information from a previous primary HTS performed at the St. Jude Children's Research Hospital (and a confirmatory screen in the case of Workflow 2) was used for training. Of the 2000 compounds proposed by the Workflows, 114 were selected for follow-up testing based on availability. Excluding the two known anti-malarials quinidine and amodiaquine and the 31 compounds already present in the primary screen, 46 of the 81 new compounds were found to be active. This corresponds to a high hit rate of 57% and shows that the machine-learning methods in both Workflows successfully identified scaffolds with anti-malaria activity. There was good agreement between the two Workflows in the general scaffolds identified, even though the exact compounds and rankings differed. The most interesting group of compounds in the tested set contains an aminopyrimidine-type scaffold with a second 
                    <italic toggle="yes">N</italic>-alkyl substituent instead of a known 
                    <italic toggle="yes">N</italic>-aryl substituent. In particular, SJ000866807, the most active compound of this series, shows good ligand efficiency.</p>
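                <p>As a quick check of the arithmetic behind the reported hit rate, the following minimal Python sketch recomputes it from the counts stated in this section. The variable and function names are illustrative assumptions and are not taken from the tutorials.</p>
                <code language="python" xml:space="preserve">
```python
# Counts as stated in the Conclusions; all names here are illustrative.
n_tested = 114          # compounds selected for follow-up testing
n_known_drugs = 2       # quinidine and amodiaquine
n_in_primary_hts = 31   # compounds already present in the primary screen
n_active_new = 46       # new compounds found to be active


def hit_rate_percent(active, total):
    """Return the share of active compounds as a percentage of total."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return 100.0 * active / total


# New compounds are those tested minus known drugs and primary-screen members.
n_new = n_tested - n_known_drugs - n_in_primary_hts
rate = hit_rate_percent(n_active_new, n_new)
print(f"{n_active_new} of {n_new} new compounds active: {rate:.0f}%")
```
                </code>
                <p>The denominator contains only the 81 compounds that were neither known anti-malarials nor part of the primary screen, so the rate reflects prospective predictions rather than rediscovered training data.</p>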
            </sec>
            <sec>
                <title>Data and software availability</title>
                <p>The tutorials are available on the TDT website (
                    <ext-link ext-link-type="uri" xlink:href="http://www.tdtproject.org/2012-competition--tutorials.html">http://www.tdtproject.org/2012-competition--tutorials.html</ext-link>) and on GitHub (
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/sriniker/TDT-tutorial-2014">https://github.com/sriniker/TDT-tutorial-2014</ext-link> and 
                    <ext-link ext-link-type="uri" xlink:href="https://github.com/sdvillal/tdt-malaria-followup">https://github.com/sdvillal/tdt-malaria-followup</ext-link>). Both tutorials use only freely available software as specified above. The data from the primary HTS and confirmatory dose-response assay used in the TDT competition are available on the TDT website (
                    <ext-link ext-link-type="uri" xlink:href="http://www.tdtproject.org/challenge-1---malaria-hts.html">http://www.tdtproject.org/challenge-1---malaria-hts.html</ext-link>) and are also deposited in ChEMBL, as part of the Neglected Tropical Diseases set (
                    <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chemblntd/">ChEMBL-NTD</ext-link>). The identifiers, SMILES, EC
                    <sub>50</sub> values and raw data for the held-out test set
                    <sup>
                        <xref ref-type="bibr" rid="ref-25">25</xref>
                    </sup>, as well as for the 114 compounds tested in this study, are given in the 
                    <xref ref-type="other" rid="SM1">Supplementary material</xref>.</p>
            </sec>
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgements</title>
            <p>The authors thank eMolecules as the TDT partner for compound acquisition, and ChemBridge and Enamine for providing compounds. The authors also thank the other sponsors of TDT (
                <ext-link ext-link-type="uri" xlink:href="http://www.tdtproject.org/partners-and-sponsors.html">http://www.tdtproject.org/partners-and-sponsors.html</ext-link>) and the TDT steering committee (
                <ext-link ext-link-type="uri" xlink:href="http://www.tdtproject.org/steering-committee.html">http://www.tdtproject.org/steering-committee.html</ext-link>).</p>
        </ack>
        <sec id="SM1" sec-type="supplementary-material">
            <title>Supplementary material</title>
            <p id="ST1">Supplementary Table S1: Identifiers, EC
                <sub>50</sub> values, final scores and ranks in Workflows 1 and 2 for the 31 tested compounds that were part of the primary HTS.</p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/11905/f11aa915-31bd-40e3-bf3c-31891c29787f.pdf">Click here to access the data.</ext-link>
            </p>
            <p>Supplementary Table S2: Identifiers, SMILES, EC
                <sub>50</sub> values and raw data for the 1056 molecules in the external held-out test set.</p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/11905/fff35227-d8d6-48f6-888d-06ad4ebf95f8.xlsx">Click here to access the data.</ext-link>
            </p>
            <p>Supplementary Table S3: Identifiers, SMILES, EC
                <sub>50</sub> values and raw data for the 114 molecules tested in this study.</p>
            <p>
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/11905/90e265c7-9704-4792-b56e-3f539947840b.xlsx">Click here to access the data.</ext-link>
            </p>
        </sec>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jansen</surname>
                            <given-names>JM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Cornell</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tseng</surname>
                            <given-names>YJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Teach-Discover-Treat (TDT): collaborative computational drug discovery for neglected diseases.</article-title>
                    <source>

                        <italic toggle="yes">J Mol Graph Model.</italic>
</source>
                    <year>2012</year>;<volume>38</volume>:<fpage>360</fpage>&#x2013;<lpage>362</lpage>.
                    <pub-id pub-id-type="pmid">23085175</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jmgm.2012.07.007</pub-id>
                    <pub-id pub-id-type="pmcid">3508335</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Koes</surname>
                            <given-names>DR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pabon</surname>
                            <given-names>NA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Deng</surname>
                            <given-names>X</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A Teach-Discover-Treat Application of ZincPharmer: An Online Interactive Pharmacophore Modeling and Virtual Screening Tool.</article-title>
                    <source>

                        <italic toggle="yes">PLoS One.</italic>
</source>
                    <year>2015</year>;<volume>10</volume>(<issue>8</issue>):<fpage>e0134697</fpage>.
                    <pub-id pub-id-type="pmid">26258606</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pone.0134697</pub-id>
                    <pub-id pub-id-type="pmcid">4530941</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bender</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Glen</surname>
                            <given-names>RC</given-names>
                        </name>
</person-group>:
                    <article-title>Molecular similarity: a key technique in molecular informatics.</article-title>
                    <source>

                        <italic toggle="yes">Org Biomol Chem.</italic>
</source>
                    <year>2004</year>;<volume>2</volume>(<issue>22</issue>):<fpage>3204</fpage>&#x2013;<lpage>3218</lpage>.
                    <pub-id pub-id-type="pmid">15534697</pub-id>
                    <pub-id pub-id-type="doi">10.1039/B409813G</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sheridan</surname>
                            <given-names>RP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kearsley</surname>
                            <given-names>SK</given-names>
                        </name>
</person-group>:
                    <article-title>Why do we need so many chemical similarity search methods?</article-title>
                    <source>

                        <italic toggle="yes">Drug Discov Today.</italic>
</source>
                    <year>2002</year>;<volume>7</volume>(<issue>17</issue>):<fpage>903</fpage>&#x2013;<lpage>911</lpage>.
                    <pub-id pub-id-type="pmid">12546933</pub-id>
                    <pub-id pub-id-type="doi">10.1016/S1359-6446(02)02411-X</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Roth</surname>
                            <given-names>HJ</given-names>
                        </name>
</person-group>:
                    <article-title>There is no such thing as &#x2018;diversity&#x2019;!</article-title>
                    <source>

                        <italic toggle="yes">Curr Opin Chem Biol.</italic>
</source>
                    <year>2005</year>;<volume>9</volume>(<issue>3</issue>):<fpage>293</fpage>&#x2013;<lpage>295</lpage>.
                    <pub-id pub-id-type="pmid">15939331</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.cbpa.2005.03.002</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bender</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>How similar are those molecules after all? Use two descriptors and you will have three different answers.</article-title>
                    <source>

                        <italic toggle="yes">Expert Opin Drug Discov.</italic>
</source>
                    <year>2010</year>;<volume>5</volume>(<issue>12</issue>):<fpage>1141</fpage>&#x2013;<lpage>1151</lpage>.
                    <pub-id pub-id-type="pmid">22822717</pub-id>
                    <pub-id pub-id-type="doi">10.1517/17460441.2010.517832</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Riniker</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Landrum</surname>
                            <given-names>GA</given-names>
                        </name>
</person-group>:
                    <article-title>Open-source platform to benchmark fingerprints for ligand-based virtual screening.</article-title>
                    <source>

                        <italic toggle="yes">J Cheminform.</italic>
</source>
                    <year>2013</year>;<volume>5</volume>(<issue>1</issue>):<fpage>26</fpage>.
                    <pub-id pub-id-type="pmid">23721588</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1758-2946-5-26</pub-id>
                    <pub-id pub-id-type="pmcid">3686626</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Harper</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bradshaw</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gittins</surname>
                            <given-names>JC</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Prediction of biological activity for high-throughput screening using binary kernel discrimination.</article-title>
                    <source>

                        <italic toggle="yes">J Chem Inf Comput Sci.</italic>
</source>
                    <year>2001</year>;<volume>41</volume>(<issue>5</issue>):<fpage>1295</fpage>&#x2013;<lpage>1300</lpage>.
                    <pub-id pub-id-type="pmid">11604029</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci000397q</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hert</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Willett</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wilton</surname>
                            <given-names>DJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>New methods for ligand-based virtual screening: use of data fusion and machine learning to enhance the effectiveness of similarity searching.</article-title>
                    <source>

                        <italic toggle="yes">J Chem Inf Model.</italic>
</source>
                    <year>2006</year>;<volume>46</volume>(<issue>2</issue>):<fpage>462</fpage>&#x2013;<lpage>470</lpage>.
                    <pub-id pub-id-type="pmid">16562973</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci050348j</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Geppert</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Horv&#x00e1;th</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>G&#x00e4;rtner</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Support-vector-machine-based ranking significantly improves the effectiveness of similarity searching using 2D fingerprints and multiple reference compounds.</article-title>
                    <source>

                        <italic toggle="yes">J Chem Inf Model.</italic>
</source>
                    <year>2008</year>;<volume>48</volume>(<issue>4</issue>):<fpage>742</fpage>&#x2013;<lpage>746</lpage>.
                    <pub-id pub-id-type="pmid">18318473</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci700461s</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Plewczynski</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Spieser</surname>
                            <given-names>SA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Koch</surname>
                            <given-names>U</given-names>
                        </name>
</person-group>:
                    <article-title>Performance of Machine Learning Methods for Ligand-Based Virtual Screening.</article-title>
                    <source>

                        <italic toggle="yes">Comb Chem High Throughput Screening.</italic>
</source>
                    <year>2009</year>;<volume>12</volume>(<issue>4</issue>):<fpage>358</fpage>&#x2013;<lpage>368</lpage>.
                    <pub-id pub-id-type="pmid">19442065</pub-id>
                    <pub-id pub-id-type="doi">10.2174/138620709788167962</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Riniker</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Fechner</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Landrum</surname>
                            <given-names>GA</given-names>
                        </name>
</person-group>:
                    <article-title>Heterogeneous Classifier Fusion for Ligand-Based Virtual Screening: Or How Decision Making by Committee Can Be a Good Thing.</article-title>
                    <source>

                        <italic toggle="yes">J Chem Inf Model.</italic>
</source>
                    <year>2013</year>;<volume>53</volume>(<issue>11</issue>):<fpage>2829</fpage>&#x2013;<lpage>2836</lpage>.
                    <pub-id pub-id-type="pmid">24171408</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci400466r</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <collab>WHO</collab>:
                    <article-title>World Malaria Report</article-title>. <year>2014</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.who.int/malaria/publications/world_malaria_report_2014/en/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Staines</surname>
                            <given-names>HM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Krishna</surname>
                            <given-names>S (Eds.)</given-names>
                        </name>
</person-group>:
                    <article-title>Treatment and Prevention of Malaria</article-title>. Springer Verlag, Basel. <year>2012</year>.
                    <pub-id pub-id-type="doi">10.1007/978-3-0346-0480-2</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chatterjee</surname>
                            <given-names>AK</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yeung</surname>
                            <given-names>BK</given-names>
                        </name>
</person-group>:
                    <article-title>Back to the future: lessons learned in modern target-based and whole-cell lead optimization of antimalarials.</article-title>
                    <source>

                        <italic toggle="yes">Curr Topics Med Chem.</italic>
</source>
                    <year>2012</year>;<volume>12</volume>(<issue>5</issue>):<fpage>473</fpage>&#x2013;<lpage>483</lpage>.
                    <pub-id pub-id-type="pmid">22242845</pub-id>
                    <pub-id pub-id-type="doi">10.2174/156802612799362977</pub-id>
                    <pub-id pub-id-type="pmcid">3355380</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Biamonte</surname>
                            <given-names>MA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wanner</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Le Roch</surname>
                            <given-names>KG</given-names>
                        </name>
</person-group>:
                    <article-title>Recent advances in malaria drug discovery.</article-title>
                    <source>

                        <italic toggle="yes">Bioorg Med Chem Lett.</italic>
</source>
                    <year>2013</year>;<volume>23</volume>(<issue>10</issue>):<fpage>2829</fpage>&#x2013;<lpage>2843</lpage>.
                    <pub-id pub-id-type="pmid">23587422</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.bmcl.2013.03.067</pub-id>
                    <pub-id pub-id-type="pmcid">3762334</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Flannery</surname>
                            <given-names>EL</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chatterjee</surname>
                            <given-names>AK</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Winzeler</surname>
                            <given-names>EA</given-names>
                        </name>
</person-group>:
                    <article-title>Antimalarial drug discovery - approaches and progress towards new medicines.</article-title>
                    <source>

                        <italic toggle="yes">Nat Rev Microbiol.</italic>
</source>
                    <year>2013</year>;<volume>11</volume>(<issue>12</issue>):<fpage>849</fpage>&#x2013;<lpage>862</lpage>.
                    <pub-id pub-id-type="pmid">24217412</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nrmicro3138</pub-id>
                    <pub-id pub-id-type="pmcid">3941073</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Burrows</surname>
                            <given-names>JN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Burlot</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Campo</surname>
                            <given-names>B</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Antimalarial drug discovery - the path towards eradication.</article-title>
                    <source>

                        <italic toggle="yes">Parasitology.</italic>
</source>
                    <year>2014</year>;<volume>141</volume>(<issue>1</issue>):<fpage>128</fpage>&#x2013;<lpage>139</lpage>.
                    <pub-id pub-id-type="pmid">23863111</pub-id>
                    <pub-id pub-id-type="doi">10.1017/S0031182013000826</pub-id>
                    <pub-id pub-id-type="pmcid">3884835</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wells</surname>
                            <given-names>TN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hooft van Huijsduijnen</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Van Voorhis</surname>
                            <given-names>WC</given-names>
                        </name>
</person-group>:
                    <article-title>Malaria medicines: a glass half full?</article-title>
                    <source>

                        <italic toggle="yes">Nat Rev Drug Discov.</italic>
</source>
                    <year>2015</year>;<volume>14</volume>(<issue>6</issue>):<fpage>424</fpage>&#x2013;<lpage>442</lpage>.
                    <pub-id pub-id-type="pmid">26000721</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nrd4573</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Morris</surname>
                            <given-names>CA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Duparc</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Borghini-Fuhrer</surname>
                            <given-names>I</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Review of the clinical pharmacokinetics of artesunate and its active metabolite dihydroartemisinin following intravenous, intramuscular, oral or rectal administration.</article-title>
                    <source>

                        <italic toggle="yes">Malaria J.</italic>
</source>
                    <year>2011</year>;<volume>10</volume>:<fpage>263</fpage>.
                    <pub-id pub-id-type="pmid">21914160</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1475-2875-10-263</pub-id>
                    <pub-id pub-id-type="pmcid">3180444</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hastings</surname>
                            <given-names>IM</given-names>
                        </name>
</person-group>:
                    <article-title>The origins of antimalarial drug resistance.</article-title>
                    <source>

                        <italic toggle="yes">Trends Parasitol.</italic>
</source>
                    <year>2004</year>;<volume>20</volume>(<issue>11</issue>):<fpage>512</fpage>&#x2013;<lpage>518</lpage>.
                    <pub-id pub-id-type="pmid">15471702</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.pt.2004.08.006</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Klein</surname>
                            <given-names>EY</given-names>
                        </name>
</person-group>:
                    <article-title>Antimalarial drug resistance: a review of the biology and strategies to delay emergence and spread.</article-title>
                    <source>

                        <italic toggle="yes">Int J Antimicrob Agents.</italic>
</source>
                    <year>2013</year>;<volume>41</volume>(<issue>4</issue>):<fpage>311</fpage>&#x2013;<lpage>317</lpage>.
                    <pub-id pub-id-type="pmid">23394809</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.ijantimicag.2012.12.007</pub-id>
                    <pub-id pub-id-type="pmcid">3610176</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Guiguemde</surname>
                            <given-names>WA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shelat</surname>
                            <given-names>AA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bouck</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Chemical genetics of 
                        <italic toggle="yes">Plasmodium falciparum.</italic>
                    </article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2010</year>;<volume>465</volume>(<issue>7296</issue>):<fpage>311</fpage>&#x2013;<lpage>315</lpage>.
                    <pub-id pub-id-type="pmid">20485428</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature09099</pub-id>
                    <pub-id pub-id-type="pmcid">2874979</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Guiguemde</surname>
                            <given-names>WA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shelat</surname>
                            <given-names>AA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Garcia-Bustos</surname>
                            <given-names>JF</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Global Phenotypic Screening for Antimalarials.</article-title>
                    <source>

                        <italic toggle="yes">Chem Biol.</italic>
</source>
                    <year>2012</year>;<volume>19</volume>(<issue>1</issue>):<fpage>116</fpage>&#x2013;<lpage>129</lpage>.
                    <pub-id pub-id-type="pmid">22284359</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.chembiol.2012.01.004</pub-id>
                    <pub-id pub-id-type="pmcid">3269778</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Smithson</surname>
                            <given-names>DC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Clark</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Connelly</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Held-out test set with 1056 molecules, to be published</article-title>.<year>2012</year>.</mixed-citation>
            </ref>
            <ref id="ref-26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Walters</surname>
                            <given-names>WP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Namchuk</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>Designing screens: how to make your hits a hit.</article-title>
                    <source>

                        <italic toggle="yes">Nat Rev Drug Discov.</italic>
</source>
                    <year>2003</year>;<volume>2</volume>(<issue>4</issue>):<fpage>259</fpage>&#x2013;<lpage>266</lpage>.
                    <pub-id pub-id-type="pmid">12669025</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nrd1063</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Davies</surname>
                            <given-names>JW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Glick</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jenkins</surname>
                            <given-names>JL</given-names>
                        </name>
</person-group>:
                    <article-title>Streamlining lead discovery by aligning 
                        <italic toggle="yes">in silico</italic> and high-throughput screening.</article-title>
                    <source>

                        <italic toggle="yes">Curr Opin Chem Biol.</italic>
</source>
                    <year>2006</year>;<volume>10</volume>(<issue>4</issue>):<fpage>343</fpage>&#x2013;<lpage>351</lpage>.
                    <pub-id pub-id-type="pmid">16822701</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.cbpa.2006.06.022</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Baell</surname>
                            <given-names>JB</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Holloway</surname>
                            <given-names>GA</given-names>
                        </name>
</person-group>:
                    <article-title>New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays.</article-title>
                    <source>

                        <italic toggle="yes">J Med Chem.</italic>
</source>
                    <year>2010</year>;<volume>53</volume>(<issue>7</issue>):<fpage>2719</fpage>&#x2013;<lpage>2740</lpage>.
                    <pub-id pub-id-type="pmid">20131845</pub-id>
                    <pub-id pub-id-type="doi">10.1021/jm901137j</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Butina</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>Unsupervised Data Base Clustering Based on Daylight's Fingerprint and Tanimoto Similarity: A Fast and Automated Way To Cluster Small and Large Data Sets.</article-title>
                    <source>

                        <italic toggle="yes">J Chem Inf Comput Sci.</italic>
</source>
                    <year>1999</year>;<volume>39</volume>(<issue>4</issue>):<fpage>747</fpage>&#x2013;<lpage>750</lpage>.
                    <pub-id pub-id-type="doi">10.1021/ci9803381</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Breiman</surname>
                            <given-names>L</given-names>
                        </name>
</person-group>:
                    <article-title>Random forests.</article-title>
                    <source>

                        <italic toggle="yes">Mach Learn.</italic>
</source>
                    <year>2001</year>;<volume>45</volume>(<issue>1</issue>):<fpage>5</fpage>&#x2013;<lpage>32</lpage>.
                    <pub-id pub-id-type="doi">10.1023/A:1010933404324</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Carhart</surname>
                            <given-names>RE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Smith</surname>
                            <given-names>DH</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Venkataraghavan</surname>
                            <given-names>R</given-names>
                        </name>
</person-group>:
                    <article-title>Atom Pairs as Molecular Features in Structure-Activity Studies: Definition and Applications.</article-title>
                    <source>

                        <italic toggle="yes">J Chem Inf Comput Sci.</italic>
</source>
                    <year>1985</year>;<volume>25</volume>(<issue>2</issue>):<fpage>64</fpage>&#x2013;<lpage>73</lpage>.
                    <pub-id pub-id-type="doi">10.1021/ci00046a002</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Rogers</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hahn</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>Extended-connectivity fingerprints.</article-title>
                    <source>

                        <italic toggle="yes">J Chem Inf Model.</italic>
</source>
                    <year>2010</year>;<volume>50</volume>(<issue>5</issue>):<fpage>742</fpage>&#x2013;<lpage>754</lpage>.
                    <pub-id pub-id-type="pmid">20426451</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci100050t</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-33">
                <label>33</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Geurts</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ernst</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wehenkel</surname>
                            <given-names>L</given-names>
                        </name>
</person-group>:
                    <article-title>Extremely Randomized Trees.</article-title>
                    <source>

                        <italic toggle="yes">Mach Learn.</italic>
</source>
                    <year>2006</year>;<volume>63</volume>(<issue>1</issue>):<fpage>3</fpage>&#x2013;<lpage>42</lpage>.
                    <pub-id pub-id-type="doi">10.1007/s10994-006-6226-1</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-34">
                <label>34</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wolpert</surname>
                            <given-names>DH</given-names>
                        </name>
</person-group>:
                    <article-title>Stacked Generalization.</article-title>
                    <source>

                        <italic toggle="yes">Neural Netw.</italic>
</source>
                    <year>1992</year>;<volume>5</volume>(<issue>2</issue>):<fpage>241</fpage>&#x2013;<lpage>259</lpage>.
                    <pub-id pub-id-type="doi">10.1016/S0893-6080(05)80023-1</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-35">
                <label>35</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sill</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Takacs</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Mackey</surname>
                            <given-names>L</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Feature-Weighted Linear Stacking</article-title>.
                    <italic toggle="yes">ArXiv e-print,</italic> 0911.0460. <year>2009</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/pdf/0911.0460.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-36">
                <label>36</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Trager</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jensen</surname>
                            <given-names>JB</given-names>
                        </name>
</person-group>:
                    <article-title>Human malaria parasites in continuous culture.</article-title>
                    <source>

                        <italic toggle="yes">Science.</italic>
</source>
                    <year>1976</year>;<volume>193</volume>(<issue>4254</issue>):<fpage>673</fpage>&#x2013;<lpage>675</lpage>.
                    <pub-id pub-id-type="pmid">781840</pub-id>
                    <pub-id pub-id-type="doi">10.1126/science.781840</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-37">
                <label>37</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Smilkstein</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sriwilaijaroen</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kelly</surname>
                            <given-names>JX</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Simple and inexpensive fluorescence-based technique for high-throughput antimalarial drug screening.</article-title>
                    <source>

                        <italic toggle="yes">Antimicrob Agents Chemother.</italic>
</source>
                    <year>2004</year>;<volume>48</volume>(<issue>5</issue>):<fpage>1803</fpage>&#x2013;<lpage>1806</lpage>.
                    <pub-id pub-id-type="pmid">15105138</pub-id>
                    <pub-id pub-id-type="doi">10.1128/AAC.48.5.1803-1806.2004</pub-id>
                    <pub-id pub-id-type="pmcid">400546</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-38">
                <label>38</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ritz</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Streibig</surname>
                            <given-names>JC</given-names>
                        </name>
</person-group>:
                    <article-title>Bioassay Analysis using R.</article-title>
                    <source>

                        <italic toggle="yes">J Stat Softw.</italic>
</source>
                    <year>2005</year>;<volume>12</volume>(<issue>5</issue>):<fpage>22</fpage>.
                    <pub-id pub-id-type="doi">10.18637/jss.v012.i05</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-39">
                <label>39</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bemis</surname>
                            <given-names>GW</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Murcko</surname>
                            <given-names>MA</given-names>
                        </name>
</person-group>:
                    <article-title>The properties of known drugs. 1. Molecular frameworks.</article-title>
                    <source>

                        <italic toggle="yes">J Med Chem.</italic>
</source>
                    <year>1996</year>;<volume>39</volume>(<issue>15</issue>):<fpage>2887</fpage>&#x2013;<lpage>2893</lpage>.
                    <pub-id pub-id-type="pmid">8709122</pub-id>
                    <pub-id pub-id-type="doi">10.1021/jm9602928</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-40">
                <label>40</label>
                <mixed-citation publication-type="journal">
                    <collab>Novartis-GNF Malaria Box, </collab>
                    <person-group person-group-type="author">						

                        <name name-style="western">
                            <surname>Gagaring</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Borboa</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Genomics Institute of the Novartis Research Foundation (GNF), 10675 John Jay Hopkins Drive, San Diego CA 92121, USA and Novartis Institute for Tropical Disease, 10 Biopolis Road, Chromos # 05-01, 138 670 Singapore</article-title>.</mixed-citation>
            </ref>
            <ref id="ref-41">
                <label>41</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Gamo</surname>
                            <given-names>FJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Sanz</surname>
                            <given-names>LM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Vidal</surname>
                            <given-names>J</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Thousands of chemical starting points for antimalarial lead identification.</article-title>
                    <source>

                        <italic toggle="yes">Nature.</italic>
</source>
                    <year>2010</year>;<volume>465</volume>(<issue>7296</issue>):<fpage>305</fpage>&#x2013;<lpage>310</lpage>.
                    <pub-id pub-id-type="pmid">20485427</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nature09107</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report24288">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.12868.r24288</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Volkamer</surname>
                        <given-names>Andrea</given-names>
                    </name>
                    <xref ref-type="aff" rid="r24288a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r24288a1">
                    <label>1</label>Institute of Physiology, Charit&#x00e9; - Universit&#x00e4;tsmedizin, Berlin, Germany</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>24</day>
                <month>10</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Volkamer A</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport24288" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.11905.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors introduce the two winning workflows (WFs) from the Teach-Discover-Treat competition 2014, which feature ligand-based virtual screening pipelines for antimalarial compounds, together with results from an experimental follow-up study. The workflows show hit rates of 57% overall (52% for WF1, 100% for WF2) in a prospective study on a rather small sample, limited by compound availability and funding (72 WF1 and 9 WF2 novel compounds). The article is well written and easy to understand, and the study shows promising results in finding new active compounds. Furthermore, both workflows are available to the community.</p>
            <p> </p>
            <p> Minor comments that could be addressed to improve the manuscript:</p>
            <p> </p>
            <p> Methods/Workflows:</p>
            <p> - More detail on the more advanced ML techniques, number of features, and especially feature importances would be helpful for the reader.</p>
            <p> - The data set is highly unbalanced (~1.5K actives vs. 290K inactives). Could one expect a boost in performance when using under-/oversampling methods?</p>
            <p> - This may have been addressed in the competition itself, but it would be interesting to see how a simple model performs on the data, for example, simply ranking by similarity to known actives.</p>
            <p> </p>
            <p> Evaluation:</p>
            <p> - The authors acknowledge the small flaw in the original WF1: the top 1000 molecules accidentally represent a random selection from the top 10K. It&#x2019;s hard to compare the results now that the selection and testing phase is over, but it would be nice to see some evidence that the intended selection strategy would actually have been superior. For example, the positions in the intended ranking could be included in Table 5; the prospectively tested set is small, but a trend may become apparent.</p>
            <p> - Since 31 of the selected 114 compounds (~30%) were present in the HTS, I&#x2019;m wondering how many of the HTS compounds were present in eMolecules in total. Because they were used for training, taking them out of the evaluation would be more convincing and would probably also improve the rankings of the tested compounds.</p>
            <p> - The authors claim that the two methods generally pick similar compounds, which is somewhat expected from the design of the two WFs (similar ML methods and fingerprints). Nevertheless, this trend is not obvious to me from the few values mentioned, e.g., 7 compounds selected from WF2 are in the top 10K from WF1 (page 8). It would be more meaningful to calculate the overlap of the top 1000 compounds between the methods or the similarity between these compounds (also with respect to different top 1000 selections in WF1, see point above).</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Structural bioinformatics/computational chemistry</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment3425-24288">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Riniker</surname>
                            <given-names>Sereina</given-names>
                        </name>
                        <aff>ETH Z&#x00fc;rich, Switzerland</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>13</day>
                    <month>2</month>
                    <year>2018</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We follow the order of the points raised by the reviewer.
                    <list list-type="bullet">
                        <list-item>
                            <p>For both workflows, the code is freely available on GitHub (URLs are given in the paper).</p>
                        </list-item>
                        <list-item>
                            <p>Dataset imbalance: Workflow 1 uses an undersampling method for the random forest to deal with this issue. We will add a sentence describing this in the revised version. No over- or undersampling method was used for the models in Workflow 2. However, the logistic regression models did include an instance weighting scheme that gives more weight to instances of the minority class (i.e. the regularization parameter C is modified per sample according to the proportion of its class in the dataset). We will add information to the Supplementary Material showing that the class imbalance correction did not influence the model performance in Workflow 2.</p>
                        </list-item>
                        <list-item>
                            <p>To our knowledge, this comparison was not done in the competition itself for the external held-out set. In Task 2 of Workflow 1, the performance of the ML models is compared to simple ranking by similarity (see Table 2). The latter already showed good baseline performance, such that only random forest models were able to outperform it. We will add a sentence discussing this.</p>
                        </list-item>
                        <list-item>
                            <p>The question is which comparison could establish superiority without results for all compounds. Ideally, the enrichment in the selected 1000 compounds should be compared with that in the direct top 1000, but this is no longer possible at this stage. It is important to stress that the selection procedure was aimed at improving the SAR information in the selected 1000 compounds, not necessarily at increasing the hit rate, because the top 10,000 are already the highest-ranked compounds among the 5.5 million in eMolecules. We will add a sentence regarding this in the revised version. We will replace column "Rank (1000) Workflow 1" in Table 5 with a column "Proposed by Workflow" to provide the information about which workflow proposed which molecules. Further, we will add a separate file in the Supplementary Material with the 1000 molecules selected by the correct Workflow 1.</p>
                        </list-item>
                        <list-item>
                            <p>We agree with the reviewer that removing the HTS compounds from the eMolecules catalogue prior to ranking should have been done, but for the present work it is unfortunately too late.</p>
                        </list-item>
                        <list-item>
                            <p>52% of the (true) top 1000 compounds of Workflow 1 have a similar compound in the top 1000 of Workflow 2, using Morgan2 fingerprints (radius = 2, 4096 bits) and a Tanimoto similarity cut-off of 0.8. We will add a figure to the Supplementary Material showing the distribution of the similarity values between each compound of Workflow 1 and its most similar compound in Workflow 2, together with a short discussion in the main text.</p>
                        </list-item>
                    </list> We thank the reviewer for carefully reading the manuscript and for the constructive and insightful feedback.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report24675">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.12868.r24675</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Baumgartner</surname>
                        <given-names>Matthew&#x00a0;P.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r24675a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1279-9038</uri>
                </contrib>
                <aff id="r24675a1">
                    <label>1</label>Eli Lilly and Company, Windlesham, UK</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>16</day>
                <month>8</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Baumgartner M</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport24675" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.11905.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The authors report on their participation in the 2014 Teach-Discover-Treat (TDT) initiative. The goal of the TDT is to encourage the creation of practical tutorials for computational chemistry. The authors present the two workflow tutorials that they developed for Challenge 1 of the competition. The challenge involved three tasks: analyzing single-point phenotypic HTS results and follow-up dose-response data for a subset of the compounds, building a predictive model of the anti-malaria activity, and using that predictive model to select compounds from a set of commercially available compounds for prospective testing. The first workflow presented by the authors used only the HTS data for its predictions, and the second used both the HTS and dose-response data.</p>
            <p> </p>
            <p> Overall, I think that the paper is a thorough and easy-to-follow description of the methods and results of the two workflows, but there are a few items that I feel require revisions.</p>
            <p> </p>
            <p> Page 5. A brief description of what "heterogeneous classifier fusion" is would be appreciated.</p>
            <p> </p>
            <p> Page 6. The authors should list the total number of features that they use as descriptors in workflow 2.</p>
            <p> </p>
            <p> Page 6. When building the random forests and extremely randomized trees of varying sizes, the ensembles with 6000 trees (the largest number tested) were shown to perform best. The authors should explain why they did not try a higher number of trees.</p>
            <p> </p>
            <p> Page 6. In the "Task 2..." paragraph, the authors should state what the resulting linear combination of the models was. The ratio would be interesting to know.</p>
            <p> </p>
            <p> Page 8, in the paragraph starting "The results for the remaining...". The text states that 9 compounds predicted by workflow 2 were tested, but Table 5 lists 10 compounds from workflow 2. This should be corrected or clarified.</p>
            <p> </p>
            <p> Page 8 and Table 5. As the compounds from Workflow 1 were selected randomly due to an error, is it meaningful to list their rankings at all?</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>computational chemistry</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment3424-24675">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Riniker</surname>
                            <given-names>Sereina</given-names>
                        </name>
                        <aff>ETH Z&#x00fc;rich, Switzerland</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>13</day>
                    <month>2</month>
                    <year>2018</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We follow the order of the points raised by the reviewer. 
                    <list list-type="bullet">
                        <list-item>
                            <p>We will add a sentence about "heterogeneous classifier fusion" in the revised paper, including a reference to Ref. 13, which contains a detailed discussion of this concept.</p>
                        </list-item>
                        <list-item>
                            <p>Descriptors in Workflow 2: The RDKit descriptor set consists of 196 features, and the unfolded fingerprint set, after removal of redundant features, consists of 1,265,410 different substructures. We will add this information in the revised version.</p>
                        </list-item>
                        <list-item>
                            <p>Workflow 2: Larger numbers of trees were not investigated due to limited computer resources. However, a plateau in the performance curve was observed after 2000 trees, so only a small improvement can be expected for models with more than 6000 trees. We will add a figure to the Supplementary Material.</p>
                        </list-item>
                        <list-item>
                            <p>Workflow 2, Task 2: The linear combination in Workflow 2 placed substantially more weight on the tree models (coefficient 1.07) than on the logistic regression models (coefficient 0.07). We found later that this weighting was not optimal for the prediction of the held-out test set. Logistic regression models alone would have performed better than the original submission from Workflow 2. We will add a table with this information in the Supplementary Material and a comment in the main text.</p>
                        </list-item>
                        <list-item>
                            <p>There were nine new compounds and one known anti-malarial, amodiaquine, i.e. ten compounds in total. We will add the term "new" in one sentence of the corresponding paragraph to make it clearer.</p>
                        </list-item>
                        <list-item>
                            <p>Table 5: We agree that they are not true rankings anymore, but because this list served as the input for compound selection the rankings are used in Table 5 to mark which molecules came from Workflow 1. We will replace column "Rank (1000) Workflow 1" with a column "Proposed by Workflow" to provide the same information.</p>
                        </list-item>
                    </list> We thank the reviewer for carefully reading the manuscript and for the constructive and insightful feedback.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report24285">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.12868.r24285</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Koes</surname>
                        <given-names>David&#x00a0;Ryan</given-names>
                    </name>
                    <xref ref-type="aff" rid="r24285a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-6892-6614</uri>
                </contrib>
                <aff id="r24285a1">
                    <label>1</label>Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>2</day>
                <month>8</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Koes D</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport24285" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.11905.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The report describes the two winning ligand-based virtual screening methods of the 2014 Teach-Discover-Treat exercise. Both workflows and the supporting data are available in their entirety online (behind a request for the user's email address) and are adequately described in the manuscript. The report is of general interest to the community and a useful resource to practitioners in ligand-based drug discovery.</p>
            <p> </p>
            <p> I have a few minor suggestions for strengthening the manuscript.</p>
            <p> </p>
            <p> Page 4. A few sentences describing "heterogeneous classifier fusion" would be appreciated.</p>
            <p> </p>
            <p> Page 5. Descriptors. I would be interested in knowing the number of bits (i.e. unique ECFP/FCFP fragments) required to represent the full dataset (that is, the number of features in the input, which I suspect is actually larger than the number of examples).</p>
            <p> </p>
            <p> Page 5. Task 2. The weights for the two models found by the linear regression would be interesting to report (is one model favored more heavily than the other?).</p>
            <p> </p>
            <p> Table 4. This would be a bit more informative if the variance were reported as well.</p>
            <p> </p>
            <p> Figure 2. It isn't clear to me exactly what this is reporting. Is this the distribution of all possible pairs between the two sets? Please clarify.</p>
            <p> </p>
            <p> Table 5. My understanding is that the Rank (1000) numbers are essentially meaningless, as the compounds were (accidentally) randomly selected. Can the corrected top 1000 ranks be provided as well (or instead) and clearly labeled as such (realizing that not all compounds will have such a rank)?</p>
            <p> </p>
            <p> It's also hard to get a sense of enrichment from these numbers, since only 114 compounds were tested but the ranks have a much larger span. For example, the workflow 2 active compounds have poor ranks (&gt;500), but this is misleading since no highly ranked (novel) compounds were tested. I would really appreciate some visualization of enrichment relative to ranking (e.g. a ROC curve) for the 114/81 compounds tested for workflow 1.</p>
            <p>Is the rationale for developing the new method (or application) clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the method technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions about the method and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>If any results are presented, are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Are sufficient details provided to allow replication of the method development and its use by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>computational drug discovery</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment3423-24285">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Riniker</surname>
                            <given-names>Sereina</given-names>
                        </name>
                        <aff>ETH Z&#x00fc;rich, Switzerland</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>13</day>
                    <month>2</month>
                    <year>2018</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We follow the order of the points raised by the reviewer. 
                    <list list-type="bullet">
                        <list-item>
                            <p>We will add a sentence about "heterogeneous classifier fusion" in the revised version, including a reference to Ref. 13, which contains a detailed discussion of this concept.</p>
                        </list-item>
                        <list-item>
                            <p>Descriptors in Workflow 1: The actives in the primary HTS set yield 5935 unique Morgan2 fragments. Together with the inactives, there are 56,351 unique fragments. The 4096 bits used for the folded Morgan2 fingerprints in Workflow 1 are clearly far fewer than the number of unique fragments. However, a balance must be found between the size of the fingerprint (and the associated computational cost) and the number of collisions. In our case, we found that a size of 4096 bits presents a good compromise.</p>
                        </list-item>
                        <list-item>
                            <p>Workflow 1, Task 2: The weights of all models in the classifier fusion of Workflow 1 were the same. The MAX rank was used.</p>
                        </list-item>
                        <list-item>
                            <p>We will add the standard deviation to Table 4.</p>
                        </list-item>
                        <list-item>
                            <p>Figure 2 reports the distribution of all possible pairs between the two sets. We will adapt the legend of Figure 2 to make it clearer.</p>
                        </list-item>
                        <list-item>
                            <p>Table 5: As the list with ranks served as the input for compound selection, the rankings are used in Table 5 to mark which molecules came from Workflow 1. We will replace column "Rank (1000) Workflow 1" with a column "Proposed by Workflow" to provide the same information. We will also add a separate file in the Supplementary Material with the 1000 molecules selected by the correct Workflow 1.</p>
                        </list-item>
                        <list-item>
                            <p>Both enrichment factors and ROC curves compare the ranking of active/inactive molecules against a random distribution. The proportion of actives is exceptionally high in the set of 81 compounds (i.e. 58%), because these were selected among the highest-ranked compounds from the 5.5 million in eMolecules. Due to this high percentage of actives in the list, the calculation of enrichment factors or ROC curves does not make much sense in the present case.</p>
                        </list-item>
                    </list> We thank the reviewer for carefully reading the manuscript and for the constructive and insightful feedback.</p>
            </body>
        </sub-article>
    </sub-article>
</article>
