<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="data-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.3713.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Data Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                    <subj-group>
                        <subject>Biomimetic Chemistry</subject>
                    </subj-group>
                    <subj-group>
                        <subject>Macromolecular Chemistry</subject>
                    </subj-group>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Compound data sets and software tools for chemoinformatics and medicinal chemistry applications: update and data transfer</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 3 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Hu</surname>
                        <given-names>Ye</given-names>
                    </name>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Bajorath</surname>
                        <given-names>J&#x00fc;rgen</given-names>
                    </name>
                    <uri content-type="orcid">https://orcid.org/0000-0002-0557-5714</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms University, Bonn, D-53113, Germany</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:bajorath@bit.uni-bonn.de">bajorath@bit.uni-bonn.de</email>
                </corresp>
                <fn fn-type="con">
                    <p>JB designed the study; YH collected and organized the data; YH and JB wrote the manuscript.</p>
                </fn>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were declared.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>11</day>
                <month>3</month>
                <year>2014</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2014</year>
            </pub-date>
            <volume>3</volume>
            <elocation-id>69</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>7</day>
                    <month>3</month>
                    <year>2014</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2014 Hu Y and Bajorath J</copyright-statement>
                <copyright-year>2014</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/3-69/pdf"/>
            <related-article elocation-id="10.12688/f1000research.1-11.v1" id="related-article-version-111" journal-id="F1000Research" journal-id-type="pmc" related-article-type="companion" vol="1">
                <article-title>Freely available compound data sets and software tools for chemoinformatics and computational medicinal chemistry applications</article-title>
                <pub-id pub-id-type="doi">10.12688/f1000research.1-11.v1</pub-id>
            </related-article>
            <abstract>
                <p>In 2012, we reported in a data article 30 compound data sets and/or programs developed in our laboratory and made them freely available to the scientific community to support chemoinformatics and computational medicinal chemistry applications. These data sets and computational tools were provided for download from our website. Since publication of this data article, we have generated 13 new data sets, which further extend our collection of publicly available data and tools. Due to changes in web servers and website architectures, data accessibility has recently been limited at times. Therefore, we have also transferred our data sets and tools to a public repository to ensure full and stable accessibility. To aid in data selection, we have classified the data sets according to scientific subject areas. Herein, we describe the new data sets, introduce the data organization scheme, summarize the database content and provide detailed access information in ZENODO (doi: 
                    <ext-link ext-link-type="uri" xlink:href="https://zenodo.org/record/8451/usage#.Uxc_sGePPcs">10.5281/zenodo.8451</ext-link> and doi:
                    <ext-link ext-link-type="uri" xlink:href="https://zenodo.org/record/8455?ln=en#.Uxc_9Pl_unM">10.5281/zenodo.8455</ext-link>).</p>
            </abstract>
            <funding-group>
                <funding-statement>The author(s) declared that no grants were involved in supporting this work.</funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>The compound data sets reported in our original article
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup> and the new data sets presented herein result from research in chemoinformatics and medicinal chemistry and were mostly generated from public domain repositories of compound structures and activity data. The software tools made publicly available were also developed in our laboratory
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup>. Data sets reported in the scientific literature in the context of computational method development and evaluation are often not publicly available, which limits the reproducibility of computational investigations and comparisons of different computational methods. We believe that it is important to provide such data to the scientific community to further improve the transparency and credibility of computational studies and support method development. In addition to the data sets designed for the development and evaluation of computational methods, we also make available data sets that were generated as a resource and knowledge base for medicinal chemistry applications. Our data sets and tools are provided via the ZENODO platform (
                <ext-link ext-link-type="uri" xlink:href="https://zenodo.org/">https://zenodo.org/</ext-link>) to ensure easy and stable access.</p>
        </sec>
        <sec sec-type="materials | methods">
            <title>Materials and methods</title>
            <p>The data sets reported herein were predominantly generated from 
                <ext-link ext-link-type="uri" xlink:href="https://www.ebi.ac.uk/chembl/">ChEMBL</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>,
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup>, 
                <ext-link ext-link-type="uri" xlink:href="http://www.bindingdb.org/bind/index.jsp">BindingDB</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup> and 
                <ext-link ext-link-type="uri" xlink:href="https://pubchem.ncbi.nlm.nih.gov/">PubChem</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-5">5</xref>
                </sup> (a few exceptions are specified in the original data article
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup>). Compound structures are represented as SMILES
                <sup>
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup> strings or SD files
                <sup>
                    <xref ref-type="bibr" rid="ref-7">7</xref>
                </sup>. Activity information and other (data set-dependent) annotations are provided in the individual data files. For software tools (written in different languages), the source code is also made available.</p>
        </sec>
        <sec>
            <title>Data description</title>
            <p>
                <xref ref-type="table" rid="T1">Table 1</xref> provides the updated list and classification of all freely available data sets and programs. Entries are organized according to the following scientific subject areas: data sets for structure-activity relationship (SAR) and structure-selectivity relationship (SSR) analysis, SAR visualization (SAR_VZ), and virtual screening via similarity searching or machine learning (VS_ML). In addition, the programs are provided separately (PROG). Data sets and programs are contained in separate ZENODO deposition sets, each with a unique reference. Three matched molecular pair (MMP)-based data sets that are also included in our update have recently been reported and described in detail
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>
                </sup>. Entries 1&#x2013;30 in 
                <xref ref-type="table" rid="T1">Table 1</xref> represent the data sets and programs that we initially provided 
                <italic toggle="yes">via</italic> our website
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup> and entries 31&#x2013;43 represent new data sets. The new data sets are described below:</p>
            <table-wrap id="T1" orientation="portrait" position="anchor">
                <label>Table 1. </label>
                <caption>
                    <title>Data sets and programs.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1">Entry</th>
                            <th align="left" colspan="1" rowspan="1">Year</th>
                            <th align="left" colspan="1" rowspan="1">Subject area
                                <break/>index label</th>
                            <th align="left" colspan="1" rowspan="1">Description</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td colspan="1" rowspan="1">1
                                <sup>
                                    <xref ref-type="bibr" rid="ref-9">[9]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2007</td>
                            <td colspan="1" rowspan="1">VS_ML_1</td>
                            <td colspan="1" rowspan="1">9 activity classes (AC) with increasing structural diversity</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">2
                                <sup>
                                    <xref ref-type="bibr" rid="ref-9">[9]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2007</td>
                            <td colspan="1" rowspan="1">VS_ML_2</td>
                            <td colspan="1" rowspan="1">~1.44 million ZINC compounds used for various virtual screening trials</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">3
                                <sup>
                                    <xref ref-type="bibr" rid="ref-10">[10]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2007</td>
                            <td colspan="1" rowspan="1">PROG_1</td>
                            <td colspan="1" rowspan="1">Molecular similarity histogram filtering</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">4
                                <sup>
                                    <xref ref-type="bibr" rid="ref-11">[11]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2007</td>
                            <td colspan="1" rowspan="1">SSR_1</td>
                            <td colspan="1" rowspan="1">4 SD files with 26 selectivity sets; compounds are annotated with selectivity values for different targets</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">5
                                <sup>
                                    <xref ref-type="bibr" rid="ref-12">[12]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2008</td>
                            <td colspan="1" rowspan="1">SSR_2</td>
                            <td colspan="1" rowspan="1">7 compound selectivity sets containing 267 biogenic amine GPCR antagonists</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">6
                                <sup>
                                    <xref ref-type="bibr" rid="ref-13">[13]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2008</td>
                            <td colspan="1" rowspan="1">SSR_3</td>
                            <td colspan="1" rowspan="1">18 selectivity sets for targets from 4 families</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">7
                                <sup>
                                    <xref ref-type="bibr" rid="ref-14">[14]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2008</td>
                            <td colspan="1" rowspan="1">VS_ML_3</td>
                            <td colspan="1" rowspan="1">25 sets of compounds of increasing complexity and size</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">8
                                <sup>
                                    <xref ref-type="bibr" rid="ref-15">[15]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2009</td>
                            <td colspan="1" rowspan="1">VS_ML_4</td>
                            <td colspan="1" rowspan="1">242 hERG inhibitors</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">9
                                <sup>
                                    <xref ref-type="bibr" rid="ref-16">[16]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2009</td>
                            <td colspan="1" rowspan="1">SSR_4</td>
                            <td colspan="1" rowspan="1">243 ionotropic glutamate ion channel antagonists</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">10
                                <sup>
                                    <xref ref-type="bibr" rid="ref-17">[17]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2009</td>
                            <td colspan="1" rowspan="1">PROG_2</td>
                            <td colspan="1" rowspan="1">Combinatorial analog graph (CAG) program with a sample set consisting of 51 thrombin inhibitors</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">11
                                <sup>
                                    <xref ref-type="bibr" rid="ref-18">[18]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2009</td>
                            <td colspan="1" rowspan="1">VS_ML_5</td>
                            <td colspan="1" rowspan="1">20 AC from the literature and 15 AC from the Molecular Drug Data Report</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">12
                                <sup>
                                    <xref ref-type="bibr" rid="ref-19">[19]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2010</td>
                            <td colspan="1" rowspan="1">VS_ML_6</td>
                            <td colspan="1" rowspan="1">8 AC</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">13
                                <sup>
                                    <xref ref-type="bibr" rid="ref-20">[20]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2010</td>
                            <td colspan="1" rowspan="1">PROG_3</td>
                            <td colspan="1" rowspan="1">Program to generate target selectivity patterns of scaffolds</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">14
                                <sup>
                                    <xref ref-type="bibr" rid="ref-21">[21]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2010</td>
                            <td colspan="1" rowspan="1">PROG_4</td>
                            <td colspan="1" rowspan="1">Multi-target CAGs (see also entry 10) with a sample set containing 33 kinase inhibitors</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">15
                                <sup>
                                    <xref ref-type="bibr" rid="ref-22">[22]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2010</td>
                            <td colspan="1" rowspan="1">PROG_5</td>
                            <td colspan="1" rowspan="1">SARANEA</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">16
                                <sup>
                                    <xref ref-type="bibr" rid="ref-23">[23]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2010</td>
                            <td colspan="1" rowspan="1">PROG_6</td>
                            <td colspan="1" rowspan="1">3D activity landscape program with a sample set containing 248 cathepsin S inhibitors</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">17
                                <sup>
                                    <xref ref-type="bibr" rid="ref-24">[24]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2010</td>
                            <td colspan="1" rowspan="1">SAR_1</td>
                            <td colspan="1" rowspan="1">2 sets of MMPs from BindingDB and ChEMBL</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">18
                                <sup>
                                    <xref ref-type="bibr" rid="ref-25">[25]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2010</td>
                            <td colspan="1" rowspan="1">PROG_7</td>
                            <td colspan="1" rowspan="1">Similarity-potency tree (SPT) program with a sample set containing 874 factor Xa inhibitors</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">19
                                <sup>
                                    <xref ref-type="bibr" rid="ref-26">[26]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2010</td>
                            <td colspan="1" rowspan="1">VS_ML_7</td>
                            <td colspan="1" rowspan="1">17 target-directed compound sets; each set contains a minimum of 10 distinct scaffolds and each
                                <break/>scaffold represents 5 compounds</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">20
                                <sup>
                                    <xref ref-type="bibr" rid="ref-27">[27]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2011</td>
                            <td colspan="1" rowspan="1">SAR_VZ</td>
                            <td colspan="1" rowspan="1">10,489 malaria screening hits</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">21
                                <sup>
                                    <xref ref-type="bibr" rid="ref-28">[28]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2011</td>
                            <td colspan="1" rowspan="1">SAR_2</td>
                            <td colspan="1" rowspan="1">458 target-based sets with scaffolds and scaffold hierarchies</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">22
                                <sup>
                                    <xref ref-type="bibr" rid="ref-29">[29]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2011</td>
                            <td colspan="1" rowspan="1">SAR_VZ</td>
                            <td colspan="1" rowspan="1">4 sets of compounds active against 3 or 4 targets</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">23
                                <sup>
                                    <xref ref-type="bibr" rid="ref-30">[30]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2011</td>
                            <td colspan="1" rowspan="1">SAR_VZ</td>
                            <td colspan="1" rowspan="1">881 factor Xa inhibitors</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">24
                                <sup>
                                    <xref ref-type="bibr" rid="ref-31">[31]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2011</td>
                            <td colspan="1" rowspan="1">VS_ML_8</td>
                            <td colspan="1" rowspan="1">50 AC prioritized for similarity searching</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">25
                                <sup>
                                    <xref ref-type="bibr" rid="ref-32">[32]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2011</td>
                            <td colspan="1" rowspan="1">VS_ML_9</td>
                            <td colspan="1" rowspan="1">25 data sets from successful ligand-based virtual screening applications</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">26
                                <sup>
                                    <xref ref-type="bibr" rid="ref-33">[33]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2011</td>
                            <td colspan="1" rowspan="1">SAR_3</td>
                            <td colspan="1" rowspan="1">26 conserved scaffolds in activity profile sequences of length 4</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">27
                                <sup>
                                    <xref ref-type="bibr" rid="ref-34">[34]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2011</td>
                            <td colspan="1" rowspan="1">PROG_8</td>
                            <td colspan="1" rowspan="1">Scaffold distance function</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">28
                                <sup>
                                    <xref ref-type="bibr" rid="ref-35">[35]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2011</td>
                            <td colspan="1" rowspan="1">SAR_4</td>
                            <td colspan="1" rowspan="1">2 sets of compounds with multiple K
                                <sub>i</sub> or IC
                                <sub>50</sub> measurements against the same targets that differed within
                                <break/>1 order of magnitude</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">29
                                <sup>
                                    <xref ref-type="bibr" rid="ref-36">[36]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2012</td>
                            <td colspan="1" rowspan="1">SAR_VZ</td>
                            <td colspan="1" rowspan="1">4 AC</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">30
                                <sup>
                                    <xref ref-type="bibr" rid="ref-37">[37]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2012</td>
                            <td colspan="1" rowspan="1">SAR_5</td>
                            <td colspan="1" rowspan="1">5 sets of different types of activity cliffs</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">31
                                <sup>
                                    <xref ref-type="bibr" rid="ref-38">[38]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2012</td>
                            <td colspan="1" rowspan="1">VS_ML_10</td>
                            <td colspan="1" rowspan="1">50 AC for scaffold hopping analysis</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">32
                                <sup>
                                    <xref ref-type="bibr" rid="ref-39">[39]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2012</td>
                            <td colspan="1" rowspan="1">SAR_6</td>
                            <td colspan="1" rowspan="1">61 AC consisting of SAR transfer series with regular potency progression</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">33
                                <sup>
                                    <xref ref-type="bibr" rid="ref-40">[40]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2013</td>
                            <td colspan="1" rowspan="1">SAR_7</td>
                            <td colspan="1" rowspan="1">4 activity measurement type-dependent sets of scaffolds</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">34
                                <sup>
                                    <xref ref-type="bibr" rid="ref-41">[41]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2013</td>
                            <td colspan="1" rowspan="1">VS_ML_11</td>
                            <td colspan="1" rowspan="1">2 multi-target compound sets</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">35
                                <sup>
                                    <xref ref-type="bibr" rid="ref-42">[42]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2013</td>
                            <td colspan="1" rowspan="1">VS_ML_12</td>
                            <td colspan="1" rowspan="1">4 multi-target compound sets and 3 multi-mechanism sets</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">36
                                <sup>
                                    <xref ref-type="bibr" rid="ref-43">[43]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2013</td>
                            <td colspan="1" rowspan="1">SAR_8</td>
                            <td colspan="1" rowspan="1">2337 compound series matrices</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">37
                                <sup>
                                    <xref ref-type="bibr" rid="ref-44">[44]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2013</td>
                            <td colspan="1" rowspan="1">SAR_9</td>
                            <td colspan="1" rowspan="1">128 AC containing &#x2265;100 compounds with K
                                <sub>i</sub> values</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">38
                                <sup>
                                    <xref ref-type="bibr" rid="ref-45">[45]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2014</td>
                            <td colspan="1" rowspan="1">SAR_10</td>
                            <td colspan="1" rowspan="1">30,452 and 45,607 target-based MMS with K
                                <sub>i</sub> and IC
                                <sub>50</sub> values, respectively</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">39
                                <sup>
                                    <xref ref-type="bibr" rid="ref-46">[46]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2014</td>
                            <td colspan="1" rowspan="1">SAR_11</td>
                            <td colspan="1" rowspan="1">221 drug-unique scaffolds</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">40
                                <sup>
                                    <xref ref-type="bibr" rid="ref-47">[47]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2014</td>
                            <td colspan="1" rowspan="1">SAR_12</td>
                            <td colspan="1" rowspan="1">92,734 MMPs based upon retrosynthetic rules for 435 AC</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">41
                                <sup>
                                    <xref ref-type="bibr" rid="ref-8">[8]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2014</td>
                            <td colspan="1" rowspan="1">SAR_13</td>
                            <td colspan="1" rowspan="1">20,073 and 25,297 MMP-based activity cliffs with K
                                <sub>i</sub> and IC
                                <sub>50</sub> values, respectively</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">42
                                <sup>
                                    <xref ref-type="bibr" rid="ref-8">[8]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2014</td>
                            <td colspan="1" rowspan="1">SAR_14</td>
                            <td colspan="1" rowspan="1">4 activity measurement type-dependent sets of SAR transfer series with approximate or regular
                                <break/>potency progression</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">43
                                <sup>
                                    <xref ref-type="bibr" rid="ref-8">[8]</xref>
                                </sup>
                            </td>
                            <td colspan="1" rowspan="1">2014</td>
                            <td colspan="1" rowspan="1">SAR_15</td>
                            <td colspan="1" rowspan="1">169,889 and 240,322 transformation size-restricted MMPs based upon retrosynthetic rules with K
                                <sub>i</sub> and
                                <break/>IC
                                <sub>50</sub> values, respectively</td>
                        </tr>
                    </tbody>
                </table>
                <table-wrap-foot>
                    <fn>
                        <p>Data entries are organized according to scientific subject areas: structure-activity relationship (SAR) and structure-selectivity relationship (SSR) analysis, SAR visualization (SAR_VZ), virtual screening 
                            <italic toggle="yes">via</italic> similarity searching or machine learning (VS_ML), and programs (PROG). References in the Entry column provide the original publication introducing the program and/or data set. Program entries are described in more detail in Table 2 of our original data article
                            <sup>
                                <xref ref-type="bibr" rid="ref-1">1</xref>
                            </sup>. The new compound data sets 31&#x2013;43 are discussed in the text. Programs and data sets reported herein have been separately deposited in ZENODO for access and download.</p>
                    </fn>
                </table-wrap-foot>
            </table-wrap>
            <sec>
                <title>Entry 31</title>
                <p>50 compound activity classes (AC) are prioritized for the evaluation of scaffold hopping potential in ligand-based virtual screening
                    <sup>
                        <xref ref-type="bibr" rid="ref-38">38</xref>
                    </sup>. These AC contain the largest proportion of scaffold pairs with the largest chemical inter-scaffold distances
                    <sup>
                        <xref ref-type="bibr" rid="ref-38">38</xref>
                    </sup> that can be derived from current bioactive compounds and hence present challenging test cases for scaffold hopping analysis.</p>
            </sec>
            <sec>
                <title>Entry 32</title>
                <p>596 SAR transfer series with regular potency progression (SAR-TS-RP) are extracted from 61 AC
                    <sup>
                        <xref ref-type="bibr" rid="ref-39">39</xref>
                    </sup>. Each SAR-TS-RP represents two compound series with different core structures and pairwise corresponding substitutions that yield comparable potency progression against a given target. These series provide a knowledge base for the analysis and prediction of SAR transfer events.</p>
            </sec>
            <sec>
                <title>Entry 33</title>
                <p>Four sets of molecular scaffolds (with each scaffold representing more than ten compounds) are provided that are active against a single target (ST), multiple targets from the same family (SF), or multiple targets from different families (MF)
                    <sup>
                        <xref ref-type="bibr" rid="ref-40">40</xref>
                    </sup>. Data sets are separately assembled for different types of potency measurements (
                    <italic toggle="yes">i.e.</italic>, K
                    <sub>i</sub> and IC
                    <sub>50</sub> values) and provide a resource of scaffolds representing compounds with varying degrees of target promiscuity.</p>
            </sec>
            <sec>
                <title>Entry 34</title>
                <p>Two multi-target compound data sets consist of confirmed screening hits
                    <sup>
                        <xref ref-type="bibr" rid="ref-41">41</xref>
                    </sup>. Each set contains compounds with single-, dual-, and triple-target activity, or no activity. These data provide test cases for machine learning or other approaches to differentiate between compounds with overlapping yet distinct activity profiles.</p>
            </sec>
            <sec>
                <title>Entry 35</title>
                <p>Four multi-target compound data sets are provided
                    <sup>
                        <xref ref-type="bibr" rid="ref-42">42</xref>
                    </sup>. Each set contains compounds tested in three different assays. Compounds are organized into eight different subsets according to their activity profiles, 
                    <italic toggle="yes">i.e.</italic>, single-, dual-, and triple-target activity, or no activity. In addition, three multi-mechanism compound sets are designed
                    <sup>
                        <xref ref-type="bibr" rid="ref-42">42</xref>
                    </sup>. In the latter case, compounds are organized into four subsets according to their mechanism-of-action. These data sets also represent test cases for machine learning to distinguish compounds with different activity profiles or mechanisms.</p>
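                <p>The eight subsets follow from the three binary assay outcomes per compound (three single-target, three dual-target, one triple-target, and one inactive profile). The following minimal sketch enumerates these profiles; the assay labels and function names are hypothetical and are not taken from the deposited data sets.</p>
                <code language="python">
```python
from itertools import product

# Hypothetical assay labels; the deposited sets use their own target names.
ASSAYS = ("A", "B", "C")


def activity_profile(active_in):
    """Return a tuple of booleans marking activity in each of the three assays."""
    return tuple(assay in active_in for assay in ASSAYS)


def profile_label(profile):
    """Name a profile by the assays in which the compound is active."""
    hits = [assay for assay, active in zip(ASSAYS, profile) if active]
    return "+".join(hits) if hits else "inactive"


# Enumerate every possible outcome over three binary assay results.
ALL_PROFILES = [profile_label(p) for p in product((False, True), repeat=3)]
```
                </code>
                <p>Under these assumptions, a compound active in assays A and C only is assigned to the dual-target subset labeled "A+C".</p>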
            </sec>
            <sec>
                <title>Entry 36</title>
                <p>2337 non-redundant compound series matrices (CSMs) are generated covering compounds active against a wide spectrum of targets
                    <sup>
                        <xref ref-type="bibr" rid="ref-43">43</xref>
                    </sup>. Each matrix contains at least two analogous matching molecular series (MMS) with structurally related yet distinct cores. A matrix consists of known active compounds and structurally related virtual compounds and hence provides suggestions for compound design.</p>
            </sec>
            <sec>
                <title>Entry 37</title>
                <p>128 target-based data sets are assembled, each consisting of at least 100 compounds with precisely specified equilibrium constants (K
                    <sub>i</sub> values) below 1 &#x00b5;M for human targets
                    <sup>
                        <xref ref-type="bibr" rid="ref-44">44</xref>
                    </sup>. These high-confidence activity data sets provide a sound basis for SAR exploration.</p>
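                <p>The selection criteria stated above can be illustrated with a short sketch. The record layout, the use of "=" to mark a precisely specified measurement, and all names are assumptions made for illustration only; the restriction to human targets is not modeled, and the code does not reproduce the original extraction protocol.</p>
                <code language="python">
```python
from collections import defaultdict

KI_THRESHOLD_NM = 1000.0  # 1 micromolar expressed in nM
MIN_COMPOUNDS = 100       # minimum set size stated in the text


def select_target_sets(records, min_compounds=MIN_COMPOUNDS):
    """Group compounds by target, keeping exact Ki values below the threshold.

    records: iterable of (target_id, compound_id, relation, ki_nm) tuples,
    a hypothetical layout chosen for this sketch.
    """
    by_target = defaultdict(set)
    for target_id, compound_id, relation, ki_nm in records:
        # "=" marks a precisely specified measurement (assumed encoding).
        if relation == "=" and KI_THRESHOLD_NM > ki_nm:
            by_target[target_id].add(compound_id)
    # Retain only targets with a sufficient number of qualifying compounds.
    return {t: c for t, c in by_target.items() if len(c) >= min_compounds}
```
                </code>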
            </sec>
            <sec>
                <title>Entry 38</title>
                <p>30,452 and 45,607 target-based MMS with K
                    <sub>i</sub> and IC
                    <sub>50</sub> values, respectively, are extracted from bioactive compounds
                    <sup>
                        <xref ref-type="bibr" rid="ref-45">45</xref>
                    </sup>.</p>
            </sec>
            <sec>
                <title>Entry 39</title>
                <p>221 scaffolds are identified that only occur in approved drugs but are not found in currently available bioactive compounds
                    <sup>
                        <xref ref-type="bibr" rid="ref-46">46</xref>
                    </sup>. Accordingly, these scaffolds have been termed drug-unique scaffolds.</p>
            </sec>
            <sec>
                <title>Entry 40</title>
                <p>92,734 MMPs are generated from 435 AC on the basis of retrosynthetic rules
                    <sup>
                        <xref ref-type="bibr" rid="ref-47">47</xref>
                    </sup>. These MMPs consider chemical reaction information and should be useful for practical medicinal chemistry applications.</p>
            </sec>
            <sec>
                <title>Entry 41</title>
                <p>20,073 and 25,297 MMP-based activity cliffs (
                    <italic toggle="yes">i.e.</italic>, pairs of structurally analogous compounds with at least a 100-fold difference in potency) are extracted from specifically active compounds based upon K
                    <sub>i</sub> and IC
                    <sub>50</sub> values, respectively
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>. The MMP-based activity cliffs provide a large knowledge base for SAR analysis.</p>
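                <p>The potency criterion for an activity cliff can be sketched as follows. The sketch assumes that MMPs have already been identified and that both potency values in a pair share the same measurement type and unit; the function names and data layout are hypothetical.</p>
                <code language="python">
```python
FOLD_THRESHOLD = 100.0  # minimum potency ratio stated in the text


def is_activity_cliff(potency_a, potency_b, fold=FOLD_THRESHOLD):
    """Check whether two potency values differ by at least `fold`.

    Both values must be positive and given in the same unit (e.g., nM).
    """
    low, high = sorted((potency_a, potency_b))
    if not low > 0:
        raise ValueError("potency values must be positive")
    return high / low >= fold


def cliffs_from_pairs(mmps, potencies, fold=FOLD_THRESHOLD):
    """Filter pre-computed MMPs (pairs of compound IDs) for activity cliffs."""
    return [
        (a, b)
        for a, b in mmps
        if is_activity_cliff(potencies[a], potencies[b], fold)
    ]
```
                </code>
                <p>On a logarithmic potency scale, the same criterion corresponds to a difference of at least two log units.</p>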
            </sec>
            <sec>
                <title>Entry 42</title>
                <p>157 and 513 MMP-based SAR transfer series with approximate potency progression plus 60 and 322 SAR transfer series with regular potency progression based upon K
                    <sub>i</sub> and IC
                    <sub>50</sub> values, respectively, are isolated from bioactive compounds. These transfer series are active against individual targets
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>. Similar to MMP-based activity cliffs, SAR transfer series provide a resource for SAR analysis and compound design.</p>
            </sec>
            <sec>
                <title>Entry 43</title>
                <p>169,889 and 240,322 transformation size-restricted MMPs based upon retrosynthetic rules with K
                    <sub>i</sub> and IC
                    <sub>50</sub> values, respectively, are systematically extracted from available AC
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>. In contrast to the retrosynthetic rule-based MMPs of entry 40, the applied transformation size restrictions ensure that the chemical changes distinguishing the compounds in each pair are small.</p>
            </sec>
        </sec>
        <sec>
            <title>Summary</title>
            <p>Herein we have provided an updated release of data sets and programs for chemoinformatics and medicinal chemistry that we make freely available. In total, 13 new data sets are introduced. Transferring all data entries in an organized form to the ZENODO platform makes them easily accessible. We hope that the current release will be of interest and use to investigators in academia and the pharmaceutical industry.</p>
        </sec>
        <sec>
            <title>Data availability</title>
            <p>The data referenced by this article are under copyright with the following copyright statement: Copyright: &#x00a9; 2014 Hu Y and Bajorath J</p>
            <p>Data associated with the article are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).
                <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/publicdomain/zero/1.0/">http://creativecommons.org/publicdomain/zero/1.0/</ext-link>
            </p>
            <p>ZENODO: Programs for chemoinformatics and computational medicinal chemistry, doi: 
                <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.8451">10.5281/zenodo.8451</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-48">48</xref>
                </sup>.</p>
            <p>ZENODO: Data sets for chemoinformatics and computational medicinal chemistry, doi: 
                <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.8455">10.5281/zenodo.8455</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-49">49</xref>
                </sup>.</p>
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgments</title>
            <p>We are grateful to current and former members of our research group who have contributed to the development of the data sets and programs reported herein.</p>
        </ack>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Freely available compound data sets and software tools for chemoinformatics and computational medicinal chemistry applications [v1; ref status: indexed, 
                        <ext-link ext-link-type="uri" xlink:href="http://f1000r.es/Mu9krs">http://f1000r.es/Mu9krs</ext-link>].</article-title>
                    <source>
						
                        <italic toggle="yes">F1000Res.</italic>
					</source>
                    <year>2012</year>;<volume>1</volume>:<fpage>11</fpage>.
                    <pub-id pub-id-type="pmid">24358818</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.1-11.v1</pub-id>
                    <pub-id pub-id-type="pmcid">3782340</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Gaulton</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bellis</surname>
                            <given-names>LJ</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bento</surname>
                            <given-names>AP</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>ChEMBL: a large-scale bioactivity database for drug discovery.</article-title>
                    <source>
						
                        <italic toggle="yes">Nucleic Acids Res.</italic>
					</source>
                    <year>2012</year>;<volume>40</volume>(<issue>Database issue</issue>):<fpage>D1100</fpage>&#x2013;<lpage>D1107</lpage>.
                    <pub-id pub-id-type="pmid">21948594</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkr777</pub-id>
                    <pub-id pub-id-type="pmcid">3245175</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Bento</surname>
                            <given-names>AP</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Gaulton</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hersey</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>The ChEMBL bioactivity database: an update.</article-title>
                    <source>
						
                        <italic toggle="yes">Nucleic Acids Res.</italic>
					</source>
                    <year>2014</year>;<volume>42</volume>(<issue>Database issue</issue>):<fpage>D1083</fpage>&#x2013;<lpage>D1090</lpage>.
                    <pub-id pub-id-type="pmid">24214965</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkt1031</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>T</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Lin</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Wen</surname>
                            <given-names>X</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>BindingDB: a web-accessible database of experimentally determined protein&#x2013;ligand binding affinities.</article-title>
                    <source>
						
                        <italic toggle="yes">Nucleic Acids Res.</italic>
					</source>
                    <year>2007</year>;<volume>35</volume>(<issue>Database issue</issue>):<fpage>D198</fpage>&#x2013;<lpage>D201</lpage>.
                    <pub-id pub-id-type="pmid">17145705</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkl999</pub-id>
                    <pub-id pub-id-type="pmcid">1751547</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Xiao</surname>
                            <given-names>J</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Suzek</surname>
                            <given-names>TO</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>PubChem: a public information system for analyzing bioactivities of small molecules.</article-title>
                    <source>
						
                        <italic toggle="yes">Nucleic Acids Res.</italic>
					</source>
                    <year>2009</year>;<volume>37</volume>(<issue>Web Server issue</issue>):<fpage>W623</fpage>&#x2013;<lpage>W633</lpage>.
                    <pub-id pub-id-type="pmid">19498078</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkp456</pub-id>
                    <pub-id pub-id-type="pmcid">2703903</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Weininger</surname>
                            <given-names>D</given-names>
                        </name>
					</person-group>:
                    <article-title>SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Comput Sci.</italic>
					</source>
                    <year>1988</year>;<volume>28</volume>(<issue>1</issue>):<fpage>31</fpage>&#x2013;<lpage>36</lpage>.
                    <pub-id pub-id-type="doi">10.1021/ci00057a005</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Dalby</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Nourse</surname>
                            <given-names>JG</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hounshell</surname>
                            <given-names>WD</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Description of several chemical structure file formats used by computer programs developed at Molecular Design Limited.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Comput Sci.</italic>
					</source>
                    <year>1992</year>;<volume>32</volume>(<issue>3</issue>):<fpage>244</fpage>&#x2013;<lpage>255</lpage>.
                    <pub-id pub-id-type="doi">10.1021/ci00007a012</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>de la Vega de Le&#x00f3;n</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>B</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Matched molecular pair-based data sets for computer-aided medicinal chemistry [v2; ref status: indexed, 
                        <ext-link ext-link-type="uri" xlink:href="http://f1000r.es/309">http://f1000r.es/309</ext-link>].</article-title>
                    <source>
						
                        <italic toggle="yes">F1000Res.</italic>
					</source>
                    <year>2014</year>;<volume>3</volume>:<fpage>36</fpage>.
                    <pub-id pub-id-type="doi">10.12688/f1000research.3-36.v2</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Tovar</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Eckert</surname>
                            <given-names>H</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Comparison of 2D fingerprint methods for multiple-template similarity searching on compound activity classes of increasing structural diversity.</article-title>
                    <source>
						
                        <italic toggle="yes">ChemMedChem.</italic>
					</source>
                    <year>2007</year>;<volume>2</volume>(<issue>2</issue>):<fpage>208</fpage>&#x2013;<lpage>217</lpage>.
                    <pub-id pub-id-type="pmid">17143917</pub-id>
                    <pub-id pub-id-type="doi">10.1002/cmdc.200600225</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Godden</surname>
                            <given-names>JW</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>A novel descriptor histogram filtering method for database mining and the identification of active molecules.</article-title>
                    <source>
						
                        <italic toggle="yes">Lett Drug Design Discov.</italic>
					</source>
                    <year>2007</year>;<volume>4</volume>(<issue>4</issue>):<fpage>286</fpage>&#x2013;<lpage>292</lpage>.
                    <pub-id pub-id-type="doi">10.2174/157018007784619970</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Stumpfe</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Ahmed</surname>
                            <given-names>H</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Vogt</surname>
                            <given-names>I</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Methods for computer-aided chemical biology. Part 1: Design of a benchmark system for the evaluation of compound selectivity.</article-title>
                    <source>
						
                        <italic toggle="yes">Chem Biol Drug Des.</italic>
					</source>
                    <year>2007</year>;<volume>70</volume>(<issue>3</issue>):<fpage>182</fpage>&#x2013;<lpage>194</lpage>.
                    <pub-id pub-id-type="pmid">17718713</pub-id>
                    <pub-id pub-id-type="doi">10.1111/j.1747-0285.2007.00554.x</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Vogt</surname>
                            <given-names>I</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Ahmed</surname>
                            <given-names>HE</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Auer</surname>
                            <given-names>J</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Exploring structure-selectivity relationships of biogenic amine GPCR antagonists using similarity searching and dynamic compound mapping.</article-title>
                    <source>
						
                        <italic toggle="yes">Mol Divers.</italic>
					</source>
                    <year>2008</year>;<volume>12</volume>(<issue>1</issue>):<fpage>25</fpage>&#x2013;<lpage>40</lpage>.
                    <pub-id pub-id-type="pmid">18317941</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s11030-008-9071-2</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Stumpfe</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Geppert</surname>
                            <given-names>H</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Methods for computer-aided chemical biology. Part 3: analysis of structure-selectivity relationships through single- or dual-step selectivity searching and Bayesian classification.</article-title>
                    <source>
						
                        <italic toggle="yes">Chem Biol Drug Des.</italic>
					</source>
                    <year>2008</year>;<volume>71</volume>(<issue>6</issue>):<fpage>518</fpage>&#x2013;<lpage>528</lpage>.
                    <pub-id pub-id-type="pmid">18482335</pub-id>
                    <pub-id pub-id-type="doi">10.1111/j.1747-0285.2008.00670.x</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Geppert</surname>
                            <given-names>H</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Random reduction in fingerprint bit density improves compound recall in search calculations using complex reference molecules.</article-title>
                    <source>
						
                        <italic toggle="yes">Chem Biol Drug Des.</italic>
					</source>
                    <year>2008</year>;<volume>71</volume>(<issue>6</issue>):<fpage>511</fpage>&#x2013;<lpage>517</lpage>.
                    <pub-id pub-id-type="pmid">18466274</pub-id>
                    <pub-id pub-id-type="doi">10.1111/j.1747-0285.2008.00664.x</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Nisius</surname>
                            <given-names>B</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>G&#x00f6;ller</surname>
                            <given-names>AH</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Combining cluster analysis, feature selection and multiple support vector machine models for the identification of human ether-a-go-go related gene channel blocking compounds.</article-title>
                    <source>
						
                        <italic toggle="yes">Chem Biol Drug Des.</italic>
					</source>
                    <year>2009</year>;<volume>73</volume>(<issue>1</issue>):<fpage>17</fpage>&#x2013;<lpage>25</lpage>.
                    <pub-id pub-id-type="pmid">19152631</pub-id>
                    <pub-id pub-id-type="doi">10.1111/j.1747-0285.2008.00747.x</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Ahmed</surname>
                            <given-names>H</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Geppert</surname>
                            <given-names>H</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Stumpfe</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Methods for computer-aided chemical biology. Part 4: selectivity searching for ion channel ligands and mapping of molecular fragments as selectivity markers.</article-title>
                    <source>
						
                        <italic toggle="yes">Chem Biol Drug Des.</italic>
					</source>
                    <year>2009</year>;<volume>73</volume>(<issue>3</issue>):<fpage>273</fpage>&#x2013;<lpage>282</lpage>.
                    <pub-id pub-id-type="pmid">19207462</pub-id>
                    <pub-id pub-id-type="doi">10.1111/j.1747-0285.2009.00784.x</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Peltason</surname>
                            <given-names>L</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Weskamp</surname>
                            <given-names>N</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Teckentrup</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Exploration of structure-activity relationship determinants in analogue series.</article-title>
                    <source>
						
                        <italic toggle="yes">J Med Chem.</italic>
					</source>
                    <year>2009</year>;<volume>52</volume>(<issue>10</issue>):<fpage>3212</fpage>&#x2013;<lpage>3224</lpage>.
                    <pub-id pub-id-type="pmid">19397320</pub-id>
                    <pub-id pub-id-type="doi">10.1021/jm900107b</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Nisius</surname>
                            <given-names>B</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Molecular fingerprint recombination: generating hybrid fingerprints for similarity searching from different fingerprint types.</article-title>
                    <source>
						
                        <italic toggle="yes">ChemMedChem.</italic>
					</source>
                    <year>2009</year>;<volume>4</volume>(<issue>11</issue>):<fpage>1859</fpage>&#x2013;<lpage>1863</lpage>.
                    <pub-id pub-id-type="pmid">19714705</pub-id>
                    <pub-id pub-id-type="doi">10.1002/cmdc.200900243</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Batista</surname>
                            <given-names>J</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Tan</surname>
                            <given-names>L</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Atom-centered interacting fragments and similarity search applications.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2010</year>;<volume>50</volume>(<issue>1</issue>):<fpage>79</fpage>&#x2013;<lpage>86</lpage>.
                    <pub-id pub-id-type="pmid">20030298</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci9004223</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Exploring target-selectivity patterns of molecular scaffolds.</article-title>
                    <source>
						
                        <italic toggle="yes">ACS Med Chem Lett.</italic>
					</source>
                    <year>2010</year>;<volume>1</volume>(<issue>2</issue>):<fpage>54</fpage>&#x2013;<lpage>58</lpage>.
                    <pub-id pub-id-type="doi">10.1021/ml900024v</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Wassermann</surname>
                            <given-names>AM</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Peltason</surname>
                            <given-names>L</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Computational analysis of multi-target structure-activity relationships to derive preference orders for chemical modifications toward target selectivity.</article-title>
                    <source>
						
                        <italic toggle="yes">ChemMedChem.</italic>
					</source>
                    <year>2010</year>;<volume>5</volume>(<issue>6</issue>):<fpage>847</fpage>&#x2013;<lpage>858</lpage>.
                    <pub-id pub-id-type="pmid">20414918</pub-id>
                    <pub-id pub-id-type="doi">10.1002/cmdc.201000064</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Lounkine</surname>
                            <given-names>E</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Wawer</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Wassermann</surname>
                            <given-names>AM</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>SARANEA: a freely available program to mine structure-activity and structure-selectivity relationship information in compound data sets.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2010</year>;<volume>50</volume>(<issue>1</issue>):<fpage>68</fpage>&#x2013;<lpage>78</lpage>.
                    <pub-id pub-id-type="pmid">20053000</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci900416a</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Peltason</surname>
                            <given-names>L</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Iyer</surname>
                            <given-names>P</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Rationalizing three-dimensional activity landscapes and the influence of molecular representations on landscape topology and the formation of activity cliffs.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2010</year>;<volume>50</volume>(<issue>6</issue>):<fpage>1021</fpage>&#x2013;<lpage>1033</lpage>.
                    <pub-id pub-id-type="pmid">20443603</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci100091e</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Wassermann</surname>
                            <given-names>AM</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Chemical substitutions that introduce activity cliffs across different compound classes and biological targets.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2010</year>;<volume>50</volume>(<issue>7</issue>):<fpage>1248</fpage>&#x2013;<lpage>1256</lpage>.
                    <pub-id pub-id-type="pmid">20608746</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci1001845</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Wawer</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Similarity-potency trees: a method to search for SAR information in compound data sets and derive SAR rules.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2010</year>;<volume>50</volume>(<issue>8</issue>):<fpage>1395</fpage>&#x2013;<lpage>1409</lpage>.
                    <pub-id pub-id-type="pmid">20726598</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci100197b</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Vogt</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Stumpfe</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Geppert</surname>
                            <given-names>H</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Scaffold hopping using two-dimensional fingerprints: true potential, black magic, or a hopeless endeavor? Guidelines for virtual screening.</article-title>
                    <source>
						
                        <italic toggle="yes">J Med Chem.</italic>
					</source>
                    <year>2010</year>;<volume>53</volume>(<issue>15</issue>):<fpage>5707</fpage>&#x2013;<lpage>5715</lpage>.
                    <pub-id pub-id-type="pmid">20684607</pub-id>
                    <pub-id pub-id-type="doi">10.1021/jm100492z</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Wawer</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Extracting SAR information from a large collection of anti-malarial screening hits by NSG-SPT analysis.</article-title>
                    <source>
						
                        <italic toggle="yes">ACS Med Chem Lett.</italic>
					</source>
                    <year>2011</year>;<volume>2</volume>(<issue>3</issue>):<fpage>201</fpage>&#x2013;<lpage>206</lpage>.
                    <pub-id pub-id-type="doi">10.1021/ml100240z</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Combining horizontal and vertical substructure relationships in scaffold hierarchies for activity prediction.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2011</year>;<volume>51</volume>(<issue>2</issue>):<fpage>248</fpage>&#x2013;<lpage>257</lpage>.
                    <pub-id pub-id-type="pmid">21271729</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci100448a</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Dimova</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Wawer</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Wassermann</surname>
                            <given-names>AM</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Design of multitarget activity landscapes that capture hierarchical activity cliff distributions.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2011</year>;<volume>51</volume>(<issue>2</issue>):<fpage>258</fpage>&#x2013;<lpage>266</lpage>.
                    <pub-id pub-id-type="pmid">21275393</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci100477m</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Wawer</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Local structural changes, global data views: graphical substructure-activity relationship trailing.</article-title>
                    <source>
						
                        <italic toggle="yes">J Med Chem.</italic>
					</source>
                    <year>2011</year>;<volume>54</volume>(<issue>8</issue>):<fpage>2944</fpage>&#x2013;<lpage>2951</lpage>.
                    <pub-id pub-id-type="pmid">21443196</pub-id>
                    <pub-id pub-id-type="doi">10.1021/jm200026b</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Heikamp</surname>
                            <given-names>K</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Large-scale similarity search profiling of ChEMBL compound data sets.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2011</year>;<volume>51</volume>(<issue>8</issue>):<fpage>1831</fpage>&#x2013;<lpage>1839</lpage>.
                    <pub-id pub-id-type="pmid">21728295</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci200199u</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Ripphausen</surname>
                            <given-names>P</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Wassermann</surname>
                            <given-names>AM</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>REPROVIS-DB: a benchmark system for ligand-based virtual screening derived from reproducible prospective applications.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2011</year>;<volume>51</volume>(<issue>10</issue>):<fpage>2467</fpage>&#x2013;<lpage>2473</lpage>.
                    <pub-id pub-id-type="pmid">21902278</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci200309j</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-33">
                <label>33</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Activity profile sequences: a concept to account for the progression of compound activity in target space and to extract SAR information from analogue series with multiple target annotations.</article-title>
                    <source>
						
                        <italic toggle="yes">ChemMedChem.</italic>
					</source>
                    <year>2011</year>;<volume>6</volume>(<issue>12</issue>):<fpage>2150</fpage>&#x2013;<lpage>2154</lpage>.
                    <pub-id pub-id-type="pmid">22052747</pub-id>
                    <pub-id pub-id-type="doi">10.1002/cmdc.201100395</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-34">
                <label>34</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>R</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Stumpfe</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Vogt</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Development of a method to consistently quantify the structural distance between scaffolds and to assess scaffold hopping potential.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2011</year>;<volume>51</volume>(<issue>10</issue>):<fpage>2507</fpage>&#x2013;<lpage>2514</lpage>.
                    <pub-id pub-id-type="pmid">21955025</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci2003945</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-35">
                <label>35</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Stumpfe</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Assessing the confidence level of public domain compound activity data and the impact of alternative potency measurements on SAR analysis.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2011</year>;<volume>51</volume>(<issue>12</issue>):<fpage>3131</fpage>&#x2013;<lpage>3137</lpage>.
                    <pub-id pub-id-type="pmid">22059677</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci2004434</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-36">
                <label>36</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Gupta-Ostermann</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Introducing the LASSO graph for compound data set representation and structure-activity relationship analysis.</article-title>
                    <source>
						
                        <italic toggle="yes">J Med Chem.</italic>
					</source>
                    <year>2012</year>;<volume>55</volume>(<issue>11</issue>):<fpage>5546</fpage>&#x2013;<lpage>5553</lpage>.
                    <pub-id pub-id-type="pmid">22571406</pub-id>
                    <pub-id pub-id-type="doi">10.1021/jm3004762</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-37">
                <label>37</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Extending the activity cliff concept: structural categorization of activity cliffs and systematic identification of different types of cliffs in the ChEMBL database.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2012</year>;<volume>52</volume>(<issue>7</issue>):<fpage>1806</fpage>&#x2013;<lpage>1811</lpage>.
                    <pub-id pub-id-type="pmid">22758389</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci300274c</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-38">
                <label>38</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>R</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Systematic assessment of scaffold distances in ChEMBL: prioritization of compound data sets for scaffold hopping analysis in virtual screening.</article-title>
                    <source>
						
                        <italic toggle="yes">J Comput Aided Mol Des.</italic>
					</source>
                    <year>2012</year>;<volume>26</volume>(<issue>10</issue>):<fpage>1101</fpage>&#x2013;<lpage>1109</lpage>.
                    <pub-id pub-id-type="pmid">22972561</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s10822-012-9603-9</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-39">
                <label>39</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>B</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Wassermann</surname>
                            <given-names>AM</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Vogt</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Systematic assessment of compound series with SAR transfer potential.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2012</year>;<volume>52</volume>(<issue>12</issue>):<fpage>3138</fpage>&#x2013;<lpage>3143</lpage>.
                    <pub-id pub-id-type="pmid">23186159</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci300481d</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-40">
                <label>40</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Systematic identification of scaffolds representing compounds active against individual targets and single or multiple target families.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2013</year>;<volume>53</volume>(<issue>2</issue>):<fpage>312</fpage>&#x2013;<lpage>326</lpage>.
                    <pub-id pub-id-type="pmid">23339619</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci300616s</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-41">
                <label>41</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Heikamp</surname>
                            <given-names>K</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Prediction of compounds with closely related activity profiles using weighted support vector machine linear combinations.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2013</year>;<volume>53</volume>(<issue>4</issue>):<fpage>791</fpage>&#x2013;<lpage>801</lpage>.
                    <pub-id pub-id-type="pmid">23517241</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci400090t</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-42">
                <label>42</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Namasivayam</surname>
                            <given-names>V</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Balfer</surname>
                            <given-names>J</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Classification of compounds with distinct or overlapping multi-target activities and diverse molecular mechanisms using emerging chemical patterns.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2013</year>;<volume>53</volume>(<issue>6</issue>):<fpage>1272</fpage>&#x2013;<lpage>1281</lpage>.
                    <pub-id pub-id-type="pmid">23692475</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci400186n</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-43">
                <label>43</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Gupta-Ostermann</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Systematic mining of analog series with related core structures in multi-target activity space.</article-title>
                    <source>
						
                        <italic toggle="yes">J Comput Aided Mol Des.</italic>
					</source>
                    <year>2013</year>;<volume>27</volume>(<issue>8</issue>):<fpage>665</fpage>&#x2013;<lpage>674</lpage>.
                    <pub-id pub-id-type="pmid">23975272</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s10822-013-9671-5</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-44">
                <label>44</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Dimova</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Stumpfe</surname>
                            <given-names>D</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Quantifying the fingerprint descriptor dependence of structure-activity relationship information on a large scale.</article-title>
                    <source>
						
                        <italic toggle="yes">J Chem Inf Model.</italic>
					</source>
                    <year>2013</year>;<volume>53</volume>(<issue>9</issue>):<fpage>2275</fpage>&#x2013;<lpage>2281</lpage>.
                    <pub-id pub-id-type="pmid">23968259</pub-id>
                    <pub-id pub-id-type="doi">10.1021/ci4004078</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-45">
                <label>45</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>de la Vega de Le&#x00f3;n</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Systematic identification of matching molecular series and mapping of screening hits.</article-title>
                    <source>
						
                        <italic toggle="yes">Mol Inf.</italic>
					</source>
                    <year>2014</year>;
                    <italic toggle="yes">In press</italic>.</mixed-citation>
            </ref>
            <ref id="ref-46">
                <label>46</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Many drugs contain unique scaffolds with varying structural relationships to scaffolds of currently available bioactive compounds.</article-title>
                    <source>
						
                        <italic toggle="yes">Eur J Med Chem.</italic>
					</source>
                    <year>2014</year>;<volume>76</volume>:<fpage>427</fpage>&#x2013;<lpage>434</lpage>.
                    <pub-id pub-id-type="doi">10.1016/j.ejmech.2014.02.040</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-47">
                <label>47</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>de la Vega de Le&#x00f3;n</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Matched molecular pairs derived by retrosynthetic fragmentation.</article-title>
                    <source>
						
                        <italic toggle="yes">Med Chem Commun.</italic>
					</source>
                    <year>2014</year>;<volume>5</volume>(<issue>1</issue>):<fpage>64</fpage>&#x2013;<lpage>67</lpage>.
                    <pub-id pub-id-type="doi">10.1039/C3MD00259D</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-48">
                <label>48</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Programs for chemoinformatics and computational medicinal chemistry</article-title>. <year>2014</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.8451">Data Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-49">
                <label>49</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hu</surname>
                            <given-names>Y</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Bajorath</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>Data sets for chemoinformatics and computational medicinal chemistry</article-title>. <year>2014</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.8455">Data Source</ext-link>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report4077">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.3979.r4077</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Walters</surname>
                        <given-names>Patrick</given-names>
                    </name>
                    <xref ref-type="aff" rid="r4077a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-2860-7958</uri>
                </contrib>
                <aff id="r4077a1">
                    <label>1</label>Vertex Pharmaceuticals Incorporated, Cambridge, MA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>22</day>
                <month>4</month>
                <year>2014</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2014 Walters P</copyright-statement>
                <copyright-year>2014</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport4077" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.3713.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The ability to compare multiple computational methods across a series of consistent, high-quality datasets is critical to the progress of computational chemistry and cheminformatics. In the past, each paper published in the field seemed to present yet another new dataset. This dataset heterogeneity made it difficult, if not impossible, to compare methods objectively, and it impeded the progress of the field. The availability of large repositories of carefully curated data is therefore essential. The datasets described in this paper will provide an invaluable resource for future studies. It is refreshing to see the emergence of platforms like ZENODO dedicated to hosting such data.</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report4409">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.3979.r4409</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Swain</surname>
                        <given-names>Chris J.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r4409a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r4409a1">
                    <label>1</label>Cambridge Med Chem Consulting, Cambridge, UK</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>17</day>
                <month>4</month>
                <year>2014</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2014 Swain CJ</copyright-statement>
                <copyright-year>2014</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport4409" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.3713.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Building and testing novel computer models requires access to suitable datasets. The authors have compiled a very useful collection of interesting datasets and made them readily available in standard formats (SMILES and SDF). This allows others both to test existing algorithms and to develop new ones.</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report4079">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.3979.r4079</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Jain</surname>
                        <given-names>Ajay</given-names>
                    </name>
                    <xref ref-type="aff" rid="r4079a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r4079a1">
                    <label>1</label>HDF Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>13</day>
                <month>3</month>
                <year>2014</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2014 Jain A</copyright-statement>
                <copyright-year>2014</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport4079" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.3713.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Hu and Bajorath offer an update to their resource for computational chemistry. The curated data, and its engineered availability, will be of great interest, especially to methods developers. Even researchers who are interested in exploring larger data sets that illuminate issues such as activity cliffs and small-molecule structural motifs will find the resource of interest.</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
</article>
