<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="other" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.16216.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Software Tool Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Feature selection with the R package 
                    <italic>MXM</italic>
                </article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 2 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Tsagris</surname>
                        <given-names>Michail</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Tsamardinos</surname>
                        <given-names>Ioannis</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-2492-959X</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Department of Computer Science, University of Crete, Heraklion, Crete, 70013, Greece</aff>
                <aff id="a2">
                    <label>2</label>Institute of Applied and Computational Mathematics, Foundation of Research and Technology Hellas, Heraklion, Crete, 70013, Greece</aff>
                <aff id="a3">
                    <label>3</label>Gnosis Data Analysis (PC), Heraklion, Crete, 71305, Greece</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:mtsagris@uoc.gr">mtsagris@uoc.gr</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>20</day>
                <month>9</month>
                <year>2018</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2018</year>
            </pub-date>
            <volume>7</volume>
            <elocation-id>1505</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>11</day>
                    <month>9</month>
                    <year>2018</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Tsagris M and Tsamardinos I</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/7-1505/pdf"/>
            <abstract>
                <p>Feature (or variable) selection is the process of identifying the minimal set of features with the highest predictive performance on the target variable of interest. Numerous feature selection algorithms have been developed over the years, but only a few have been implemented as R packages. The R package 
                    <italic toggle="yes">MXM</italic> is one such package; it not only offers a variety of feature selection algorithms, but also has unique features that give it an advantage over its competitors: a) it contains feature selection algorithms that can treat numerous types of target variables, including continuous, percentages, time to event (survival), binary, nominal, ordinal, clustered, counts, left censored, etc.; b) it contains a variety of regression models to plug into the feature selection algorithms; c) it includes an algorithm for detecting multiple solutions (many sets of equivalent features); and d) it includes memory efficient algorithms for high volume data, i.e. data that cannot be loaded into R. In this paper we qualitatively compare 
                    <italic toggle="yes">MXM</italic> with other relevant packages and discuss its advantages and disadvantages. We also provide a demonstration of its algorithms using real high-dimensional data from various applications.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Feature selection</kwd>
                <kwd>algorithms</kwd>
                <kwd>R package</kwd>
                <kwd>computational efficiency</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100011102">
                    <funding-source>Seventh Framework Programme</funding-source>
                    <award-id>617393</award-id>
                </award-group>
                <funding-statement>The research leading to these results has received funding from the European Research Council under the European Union&#x2019;s Seventh Framework Programme (FP/2007-2013) / ERC Grant Agreement No. 617393.</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>Given a target (response or dependent) variable 
                <italic toggle="yes">Y</italic> of 
                <italic toggle="yes">n</italic> measurements and a set 
                <bold>X</bold> of 
                <italic toggle="yes">p</italic> features (predictor or independent variables), the problem of feature (or variable) selection (FS) is to identify the minimal set of features with the highest predictive performance on the target variable (outcome) of interest. Why should researchers and practitioners perform FS? For a variety of reasons
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup>: a) many features may be expensive (and/or unnecessary) to measure, especially in the clinical and medical domains; b) FS may result in more accurate models (of higher predictability) by removing noise while treating the curse-of-dimensionality; c) the produced parsimonious models are computationally cheaper and easier to understand and interpret; d) future experiments can benefit from prior feature selection tasks and provide more insight into the problem of interest, its characteristics and structure.</p>
            <p>R contains thousands of packages, but only a small portion of them are dedicated to the task of FS, and these offer limited or narrow capabilities. For example, some packages accept only a few, or only specific, types of target variables (e.g. binary and multi-class only). This leaves many types of target variables, for example percentages, left censored, positive valued, matched case-control data, etc., untreated. The choice of regression models for some types of data is also rather small. Count data is such an example, for which Poisson regression is the only model considered in nearly all R packages. Most algorithms that rely on statistical tests offer a limited choice of tests, e.g. the likelihood ratio test only. Almost all available FS algorithms are devised for data with large sample sizes, thus they cannot be used in many biological settings where the number of observations rarely (or never in some cases) exceeds 100, but the number of features is in the order of tens of thousands. Finally, some packages are designed for high volume data
                <sup>
                    <xref ref-type="other" rid="FN1">1</xref>
                </sup> only.</p>
            <p>In this paper we present 
                <italic toggle="yes">MXM</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>
                </sup>, an R package that overcomes the above shortcomings. It contains many FS algorithms
                <sup>
                    <xref ref-type="other" rid="FN2">2</xref>
                </sup>, which can handle many diverse types of target variables, while offering a pool of regression models to choose from. Several statistical tests (likelihood-ratio, Wald, permutation based) and information criteria (BIC and eBIC) can be plugged into the FS algorithms. Algorithms that work with small (and large) sample sizes, algorithms that have been customized for high volume data, and an algorithm that returns multiple sets of statistically equivalent features are some of the key characteristics of 
                <italic toggle="yes">MXM</italic>.</p>
            <p>Over the next sections, a brief qualitative comparison of 
                <italic toggle="yes">MXM</italic> with other packages available on 
                <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/MXM/index.html">CRAN</ext-link> and 
                <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/">Bioconductor</ext-link> is presented, its (dis)advantages are discussed, and its FS algorithms and related functions are outlined. Finally, we demonstrate some of the FS algorithms available in 
                <italic toggle="yes">MXM</italic> on real high dimensional data.</p>
        </sec>
        <sec>
            <title>The R package 
                <italic toggle="yes">MXM</italic>
            </title>
            <sec>
                <title>

                    <italic toggle="yes">MXM</italic> versus other R packages</title>
                <p>When searching for FS packages on 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/">CRAN</ext-link> and 
                    <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/">Bioconductor</ext-link> repositories using the keywords "feature selection", "variable selection", "selection", "screening" and "LASSO", we detected 184 R packages as of the 7th of May 2018
                    <sup>
                        <xref ref-type="other" rid="FN3">3</xref>
                    </sup>. 
                    <xref ref-type="table" rid="T1">Table 1</xref> shows the frequency of the target variable types those packages accept, while 
                    <xref ref-type="fig" rid="f1">Figure 1</xref> shows the frequency of R packages whose FS algorithms can treat at least one type of target variable, of those presented in 
                    <xref ref-type="table" rid="T1">Table 1</xref>. 
                    <xref ref-type="table" rid="T2">Table 2</xref> presents the frequency of pairwise types of target variables offered in R packages and 
                    <xref ref-type="table" rid="T3">Table 3</xref> contains information on packages allowing for less frequent regression models. Most packages offer FS algorithms that are oriented towards specific types of target variables, methodology and regression models, offering at most 3-4 options. Out of these 184 packages, 65 (35.33%) offer LASSO type FS algorithms, while 19 (10.33%) address the problem of FS from the Bayesian perspective. Only 2 (1.09%) R packages treat the case of FS with multiple datasets, while only 4 (2.17%) packages are devised for high volume data.</p>
                <table-wrap id="T1" orientation="portrait" position="anchor">
                    <label>Table 1. </label>
                    <caption>
                        <title>Frequency of 
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/">CRAN</ext-link> and 
                            <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/">Bioconductor</ext-link> FS related packages in terms of the target variable they accept.</title>
                        <p>The percentage-wise number appears inside the parentheses.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Target type</th>
                                <th align="center" colspan="1" rowspan="1">Binary</th>
                                <th align="center" colspan="1" rowspan="1">Nominal</th>
                                <th align="center" colspan="1" rowspan="1">Continuous</th>
                                <th align="center" colspan="1" rowspan="1">Counts</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td colspan="1" rowspan="1">Frequency (%)</td>
                                <td align="center" colspan="1" rowspan="1">107 (58.47%)</td>
                                <td align="center" colspan="1" rowspan="1">31 (16.93%)</td>
                                <td align="center" colspan="1" rowspan="1">120 (65.57%)</td>
                                <td align="center" colspan="1" rowspan="1">29 (15.85%)</td>
                            </tr>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Target type</th>
                                <th align="center" colspan="1" rowspan="1">Survival</th>
                                <th align="center" colspan="1" rowspan="1">Case-control</th>
                                <th align="center" colspan="1" rowspan="1">Ordinal</th>
                                <th align="center" colspan="1" rowspan="1">Multivariate</th>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Frequency (%)</td>
                                <td align="center" colspan="1" rowspan="1">27 (14.75%)</td>
                                <td align="center" colspan="1" rowspan="1">3 (1.64%)</td>
                                <td align="center" colspan="1" rowspan="1">4 (2.19%)</td>
                                <td align="center" colspan="1" rowspan="1">11 (6.01%)</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>Frequency of FS related R packages handling different types of target variables.</title>
                        <p>The horizontal axis shows the number of types (any combinations) of target variables from 
                            <xref ref-type="table" rid="T1">Table 1</xref>. For example, there are 95 R packages that can handle only 1 type (any type) of target variable, 41 packages that can handle any 2 types of target variables, while 
                            <italic toggle="yes">MXM</italic> is the only one that handles all of them.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/17707/b3a2d543-c1e2-4425-ba28-3866b8e10525_figure1.gif"/>
                </fig>
                <table-wrap id="T2" orientation="portrait" position="anchor">
                    <label>Table 2. </label>
                    <caption>
                        <title>Cross-tabulation of the FS packages in R based on the target variable.</title>
                        <p>There are 108 packages which handle binary target variables, 59 packages offering algorithms for binary and continuous target variables and only one package handling ordinal and nominal target variables, etc.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th colspan="1" rowspan="1"/>
                                <th colspan="1" rowspan="1">Binary</th>
                                <th colspan="1" rowspan="1">Nominal</th>
                                <th colspan="1" rowspan="1">Continuous</th>
                                <th colspan="1" rowspan="1">Counts</th>
                                <th colspan="1" rowspan="1">Survival</th>
                                <th colspan="1" rowspan="1">Case-control</th>
                                <th colspan="1" rowspan="1">Ordinal</th>
                                <th colspan="1" rowspan="1">Multivariate</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td colspan="1" rowspan="1">Binary</td>
                                <td colspan="1" rowspan="1">
					
                                    <italic toggle="yes">108</italic>
				</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Nominal</td>
                                <td colspan="1" rowspan="1">
					
                                    <italic toggle="yes">32</italic>
				</td>
                                <td colspan="1" rowspan="1">32</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Continuous</td>
                                <td colspan="1" rowspan="1">
					
                                    <italic toggle="yes">59</italic>
				</td>
                                <td colspan="1" rowspan="1">13</td>
                                <td colspan="1" rowspan="1">120</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Counts</td>
                                <td colspan="1" rowspan="1">28</td>
                                <td colspan="1" rowspan="1">3</td>
                                <td colspan="1" rowspan="1">28</td>
                                <td colspan="1" rowspan="1">29</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Survival</td>
                                <td colspan="1" rowspan="1">18</td>
                                <td colspan="1" rowspan="1">5</td>
                                <td colspan="1" rowspan="1">17</td>
                                <td colspan="1" rowspan="1">7</td>
                                <td colspan="1" rowspan="1">27</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Case-control</td>
                                <td colspan="1" rowspan="1">1</td>
                                <td colspan="1" rowspan="1">1</td>
                                <td colspan="1" rowspan="1">1</td>
                                <td colspan="1" rowspan="1">1</td>
                                <td colspan="1" rowspan="1">1</td>
                                <td colspan="1" rowspan="1">3</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Ordinal</td>
                                <td colspan="1" rowspan="1">4</td>
                                <td colspan="1" rowspan="1">1</td>
                                <td colspan="1" rowspan="1">2</td>
                                <td colspan="1" rowspan="1">2</td>
                                <td colspan="1" rowspan="1">1</td>
                                <td colspan="1" rowspan="1">1</td>
                                <td colspan="1" rowspan="1">4</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Multivariate
                                    <break/>continuous</td>
                                <td colspan="1" rowspan="1">4</td>
                                <td colspan="1" rowspan="1">3</td>
                                <td colspan="1" rowspan="1">8</td>
                                <td colspan="1" rowspan="1">4</td>
                                <td colspan="1" rowspan="1">3</td>
                                <td colspan="1" rowspan="1">1</td>
                                <td colspan="1" rowspan="1">1</td>
                                <td colspan="1" rowspan="1">11</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <table-wrap id="T3" orientation="portrait" position="anchor">
                    <label>Table 3. </label>
                    <caption>
                        <title>Frequency of other types of regression models for FS treated by R packages on 
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/">CRAN</ext-link> and 
                            <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/">Bioconductor</ext-link>.</title>
                        <p>The percentage-wise number appears inside the parentheses.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Regression models</th>
                                <th align="center" colspan="1" rowspan="1">Robust</th>
                                <th align="center" colspan="1" rowspan="1">GLMM</th>
                                <th align="center" colspan="1" rowspan="1">GEE</th>
                                <th align="center" colspan="1" rowspan="1">Functional</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td colspan="1" rowspan="1">Frequency (%)</td>
                                <td align="center" colspan="1" rowspan="1">4 (2.19%)</td>
                                <td align="center" colspan="1" rowspan="1">8 (4.37%)</td>
                                <td align="center" colspan="1" rowspan="1">2 (1.09%)</td>
                                <td align="center" colspan="1" rowspan="1">2 (1.09%)</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
                <p>
                    <xref ref-type="table" rid="T4">Table 4</xref> summarizes the types of target variables treated by 
                    <italic toggle="yes">MXM</italic>&#x2019;s FS algorithms along with the appropriate regression models that can be employed. The list is not exhaustive, as in some cases the type of the predictor variables (continuous or categorical) affects the choice between a regression model and a test (Pearson and Spearman for continuous and 
                    <italic toggle="yes">G</italic>
                    <sup>2</sup> test of independence for categorical). With percentages for example, 
                    <italic toggle="yes">MXM</italic> offers several regression models to plug into its FS algorithms: beta regression, quasi binomial regression or any linear regression model (robust or not) after transforming the percentages using the logistic transformation. For repeated measurements (correlated data), two options are offered, the GLMM and GEE methodologies, which can also be used with various types of target variables not mentioned here. We emphasize that 
                    <italic toggle="yes">MXM</italic> is the only package that covers all types of response variables mentioned in 
                    <xref ref-type="table" rid="T1">Table 1</xref>, as well as many types that are not available in any other FS package, such as left censored data. 
                    <italic toggle="yes">MXM</italic> also covers 3 out of the 4 cases that appear in 
                    <xref ref-type="table" rid="T3">Table 3</xref>.</p>
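                <p>As a brief illustration of the transformation route for percentages, and assuming that the logistic transformation mentioned above denotes the usual logit map (this reading is an assumption, not a description of the package internals), a percentage 
                    <italic toggle="yes">y</italic> lying strictly between 0 and 1 would be mapped to a real-valued 
                    <italic toggle="yes">z</italic> and back as follows:</p>
                <disp-formula>
                    <tex-math><![CDATA[z = \log\left(\frac{y}{1-y}\right), \qquad y = \frac{e^{z}}{1+e^{z}}, \qquad 0 < y < 1.]]></tex-math>
                </disp-formula>
                <p>Under this assumption, the transformed values are unbounded, so any linear regression model can be fitted to them, whereas values exactly equal to 0 or 1 cannot be transformed this way and would need separate handling.</p>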
                <table-wrap id="T4" orientation="portrait" position="anchor">
                    <label>Table 4. </label>
                    <caption>
                        <title>A brief overview of the types of target variables and regression models in 
                            <italic toggle="yes">MXM</italic>.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Target variable type</th>
                                <th align="left" colspan="1" rowspan="1">Regression model or test</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td colspan="1" rowspan="1">Continuous</td>
                                <td colspan="1" rowspan="1">Linear, MM and quantile regression, Pearson &amp; Spearman
                                    <break/>correlation coefficients</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Multivariate continuous</td>
                                <td colspan="1" rowspan="1">Multivariate linear regression</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">(Strictly) positive valued</td>
                                <td colspan="1" rowspan="1">Gaussian and Gamma regression with a log-link</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Percentages</td>
                                <td colspan="1" rowspan="1">Beta regression and quasi binomial regression</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Counts</td>
                                <td colspan="1" rowspan="1">Poisson, quasi Poisson, negative binomial and zero inflated
                                    <break/>Poisson regression</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Binary</td>
                                <td colspan="1" rowspan="1">Logistic regression, quasi binomial regression and 
                                    <italic toggle="yes">G</italic>
                                    <sup>2</sup> test of
                                    <break/>independence</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Nominal</td>
                                <td colspan="1" rowspan="1">Multinomial regression and 
                                    <italic toggle="yes">G</italic>
                                    <sup>2</sup> test of independence</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Ordinal</td>
                                <td colspan="1" rowspan="1">Ordinal regression</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Success out of a
                                    <break/>number of trials</td>
                                <td colspan="1" rowspan="1">Binomial regression</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Time-to-event</td>
                                <td colspan="1" rowspan="1">Cox, Weibull and exponential regression</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Matched case-control</td>
                                <td colspan="1" rowspan="1">Conditional logistic regression</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Left censored</td>
                                <td colspan="1" rowspan="1">Tobit regression</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">Repeated/clustered,
                                    <break/>longitudinal</td>
                                <td colspan="1" rowspan="1">Generalized linear mixed models (GLMM) and Generalized
                                    <break/>estimating equations (GEE)</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
            </sec>
            <sec>
                <title>Comparisons of 
                    <italic toggle="yes">MXM</italic>&#x2019; FS algorithms with other FS algorithms</title>
                <p>Most of the currently available FS algorithms in the 
                    <italic toggle="yes">MXM</italic> package have been developed by the creators and authors of the package. These algorithms have been tested and compared with other state-of-the-art algorithms under different scenarios and types of data.</p>
                <p>IAMB
                    <sup>
                        <xref ref-type="bibr" rid="ref-3">3</xref>
                    </sup> was on par with or outperformed competing machine learning algorithms when both the target variable and the features were categorical. The MMPC and MMMB algorithms
                    <sup>
                        <xref ref-type="bibr" rid="ref-4">4</xref>
                    </sup> were tested in the context of Bayesian network (BN) learning with great success, and MMPC was shown to achieve excellent false positive rates
                    <sup>
                        <xref ref-type="bibr" rid="ref-5">5</xref>
                    </sup>. MMPC was also used as the basis of MMHC
                    <sup>
                        <xref ref-type="bibr" rid="ref-6">6</xref>
                    </sup>, a prototypical algorithm for learning the structure of a Bayesian network, which outperformed all other Bayesian network learning algorithms with categorical data. For time-to-event
                    <sup>
                        <xref ref-type="bibr" rid="ref-7">7</xref>
                    </sup> and nominal categorical
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup> target variables, MMPC outperformed or was on par with LASSO and other FS algorithms. SES was contrasted against LASSO
                    <sup>
                        <xref ref-type="bibr" rid="ref-2">2</xref>
                    </sup> with continuous, binary and survival target variables, resulting in similar conclusions as before. With temporal and time-course data, SES
                    <sup>
                        <xref ref-type="bibr" rid="ref-9">9</xref>
                    </sup> outperformed the LASSO algorithm
                    <sup>
                        <xref ref-type="bibr" rid="ref-10">10</xref>
                    </sup> both in predictive performance and computational efficiency. FBED
                    <sup>
                        <xref ref-type="bibr" rid="ref-11">11</xref>
                    </sup> was compared with LASSO for the task of binary classification with sparse data, exhibiting performance similar to that of LASSO. gOMP, a generalization of OMP
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>&#x2013;
                        <xref ref-type="bibr" rid="ref-14">14</xref>
                    </sup>, has not been publicly tested, but our anecdotal experiments have shown very promising results, achieving similar or better performance than LASSO while enjoying higher computational efficiency.</p>
            </sec>
            <sec>
                <title>Advantages and disadvantages of 
                    <italic toggle="yes">MXM</italic>&#x2019;s FS algorithms</title>
                <p>The main advantage of 
                    <italic toggle="yes">MXM</italic> is that all FS algorithms accept numerous and diverse types of target variables. MMPC, SES and FBED treat all types of target variables presented in 
                    <xref ref-type="table" rid="T4">Table 4</xref>, while gOMP handles fewer types
                    <sup>
                        <xref ref-type="other" rid="FN4">4</xref>
                    </sup>.</p>
                <p>

                    <italic toggle="yes">MXM</italic> is the only R package that offers many different regression models to be employed by the FS algorithms, even for the same type of response variable, such as Poisson, quasi Poisson, negative binomial and zero inflated Poisson regression for count data. For repeated measurements, the user has the option of using GLMM or the GEE methodology (the latter with more options in the correlation structure) and for time-to-event data, Cox, Weibull and exponential regression models are the available options.</p>
                <p>A range of statistical tests and methodologies for selecting the features is offered. Instead of the usual log-likelihood ratio test, the user has the option to use the Wald test or to produce a p-value based on permutations. The latter is useful and advised when the sample size is small, which is precisely the setting MMPC and SES are designed for. FBED, on the other hand, gives the option of using information criteria, BIC
                    <sup>
                        <xref ref-type="bibr" rid="ref-15">15</xref>
                    </sup> or eBIC
                    <sup>
                        <xref ref-type="bibr" rid="ref-16">16</xref>
                    </sup>, instead of the log-likelihood ratio test.</p>
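                <p>As a minimal sketch of how these options differ, the base R code below tests a single feature in a logistic regression with the log-likelihood ratio test, the Wald test and a permutation-based p-value, and also computes the BIC difference between the two models. It does not call 
                    <italic toggle="yes">MXM</italic>; the simulated data and all object names are assumptions made purely for illustration.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;"># Illustrative sketch in base R; not MXM's implementation.
set.seed(1)
n &lt;- 60                                  # deliberately small sample
x &lt;- rnorm(n)                            # candidate feature (simulated)
y &lt;- rbinom(n, 1, plogis(0.8 * x))       # binary target (simulated)

fit0 &lt;- glm(y ~ 1, family = binomial)    # model without the feature
fit1 &lt;- glm(y ~ x, family = binomial)    # model with the feature

# Log-likelihood ratio test: difference in deviances, 1 degree of freedom
lr_stat &lt;- fit0$deviance - fit1$deviance
p_lr &lt;- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Wald test: p-value attached to the coefficient of x
p_wald &lt;- summary(fit1)$coefficients["x", "Pr(&gt;|z|)"]

# Permutation p-value: refit after shuffling x to break any association
B &lt;- 499
perm_stat &lt;- replicate(B, {
  xp &lt;- sample(x)
  fit0$deviance - glm(y ~ xp, family = binomial)$deviance
})
p_perm &lt;- (sum(perm_stat &gt;= lr_stat) + 1) / (B + 1)

# Information criterion: a positive value favours including x
bic_gain &lt;- BIC(fit0) - BIC(fit1)</styled-content>
                    </preformat>
                </p>
                <p>The permutation p-value needs one model fit per permutation, which is affordable mainly when the sample size is small.</p>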
                <p>Statistically Equivalent Signatures (SES)
                    <sup>
                        <xref ref-type="bibr" rid="ref-2">2</xref>,
                        <xref ref-type="bibr" rid="ref-17">17</xref>
                    </sup> builds upon the ideas of MMPC and returns multiple (statistically equivalent) sets of predictor variables, making it one of the few FS algorithms suggested in the literature, and available on 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/">CRAN</ext-link>, with this trait. 
                    <xref ref-type="bibr" rid="ref-18">18</xref> demonstrated that multiple, equivalent prognostic signatures for breast cancer can be extracted just by analyzing the same dataset with a different partition in training and test sets, showing the existence of several genes which are practically interchangeable in terms of predictive power. SES and MMPC are two of the few algorithms available on 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/">CRAN</ext-link> that can be used with multiple datasets in a meta-analytic way, following 
                    <xref ref-type="bibr" rid="ref-19">19</xref>.</p>
                <p>

                    <italic toggle="yes">MXM</italic> contains FS algorithms for small-sample-sized data (MMPC, MMMB and SES) and for large-sample-sized data (FBED and gOMP). FBED and gOMP have been adapted for high volume data, going beyond the limits of R. The importance of these customizations can be appreciated from the fact that large scale datasets are nowadays more frequent than before. Since classical FS algorithms cannot handle such data, modifications must be made at the algorithm level, in a memory-efficient manner, at the computer architecture level, and/or in any other way. 
                    <italic toggle="yes">MXM</italic> uses a memory-efficient R package for this purpose.</p>
                <p>Finally, many utility functions are available, such as constructing a model from the object an algorithm returned, constructing a model in general, communicating between the inputs and outputs of the algorithms, and producing long verbose output with useful information. Using 
                    <italic toggle="yes">hash</italic> objects, the computational cost of MMPC and SES is significantly reduced. The univariate associations computed by MMPC, SES and FBED can be interchanged among them, saving computational time.</p>
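                <p>The idea behind caching test results in a hash object can be sketched in base R as follows. This is not the code used inside 
                    <italic toggle="yes">MXM</italic>: the function name, the key format and the choice of a partial F-test in a linear model are assumptions made only to show how a repeated conditional test can be served from a lookup instead of being refitted.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;"># Illustrative sketch in base R; names and key format are hypothetical.
cache &lt;- new.env(hash = TRUE)     # hashed environment used as the lookup table
counter &lt;- new.env()
counter$fits &lt;- 0L                # number of tests actually computed

cached_test &lt;- function(y, x, var, cond = integer(0)) {
  # The key ignores the order of the conditioning set
  key &lt;- paste(var, paste(sort(cond), collapse = ","), sep = "|")
  if (exists(key, envir = cache, inherits = FALSE)) {
    return(get(key, envir = cache))
  }
  counter$fits &lt;- counter$fits + 1L
  small &lt;- cbind(rep(1, length(y)), x[, cond, drop = FALSE])
  big &lt;- cbind(small, x[, var])
  rss0 &lt;- sum(lm.fit(small, y)$residuals^2)
  rss1 &lt;- sum(lm.fit(big, y)$residuals^2)
  df &lt;- length(y) - ncol(big)
  # Partial F-test for adding 'var' given the conditioning set 'cond'
  pval &lt;- pf((rss0 - rss1) / (rss1 / df), 1, df, lower.tail = FALSE)
  assign(key, pval, envir = cache)
  pval
}

set.seed(1)
x &lt;- matrix(rnorm(100 * 5), 100, 5)
y &lt;- x[, 1] + rnorm(100)
p1 &lt;- cached_test(y, x, var = 1, cond = c(2, 3))   # computed
p2 &lt;- cached_test(y, x, var = 1, cond = c(3, 2))   # same key, read from cache</styled-content>
                    </preformat>
                </p>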
                <p>A disadvantage of most of 
                    <italic toggle="yes">MXM</italic>&#x2019;s algorithms is their computational cost. Their (algorithmic) order of complexity is comparable to that of state-of-the-art FS algorithms, but the nature of most of them is such that many regression models must be fit, increasing the computational burden. gOMP, by contrast, is the most efficient algorithm available in 
                    <italic toggle="yes">MXM</italic>
                    <sup>
                        <xref ref-type="other" rid="FN5">5</xref>
                    </sup>, because it is residual based and few regression models are fit. However, with clustered/longitudinal data, SES (and MMPC) were shown to scale to tens of thousands and be dramatically faster than LASSO
                    <sup>
                        <xref ref-type="bibr" rid="ref-9">9</xref>
                    </sup>. Computational efficiency is also programming language-dependent. Most of the algorithms are currently written in R and we are constantly working towards transferring them to C++ so as to decrease the computational cost significantly.</p>
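                <p>To illustrate why a residual-based search needs few model fits, the sketch below implements classical orthogonal matching pursuit for a linear model in base R. It is not gOMP as implemented in 
                    <italic toggle="yes">MXM</italic>; the function name, the fixed number of steps used as stopping rule and the simulated data are assumptions for illustration. At each step every candidate is scored by a single inner product with the current residuals, and only one regression is refitted.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;"># Illustrative sketch of classical OMP (linear model); not MXM's gOMP.
omp_sketch &lt;- function(y, x, max_vars = 3) {
  x &lt;- scale(x)                       # standardise the features
  res &lt;- y - mean(y)                  # residuals of the empty model
  selected &lt;- integer(0)
  for (step in seq_len(max_vars)) {
    # Score all candidates at once by their association with the residuals
    score &lt;- abs(drop(crossprod(x, res)))
    score[selected] &lt;- -Inf           # never pick the same feature twice
    selected &lt;- c(selected, which.max(score))
    # Only one regression per step: refit on the selected features
    fit &lt;- lm.fit(cbind(1, x[, selected, drop = FALSE]), y)
    res &lt;- fit$residuals
  }
  selected
}

set.seed(1)
x &lt;- matrix(rnorm(200 * 20), 200, 20)
y &lt;- 2 * x[, 3] - 1.5 * x[, 7] + rnorm(200)
sel &lt;- omp_sketch(y, x, max_vars = 2)</styled-content>
                    </preformat>
                </p>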
                <p>It is impossible to cover all types of target variables. For example, we have no algorithms for time series and do not treat multi-state time-to-event target variables, although we keep searching for R packages that treat other types of target variables in order to link them to 
                    <italic toggle="yes">MXM</italic>. All algorithms are limited to linear or generalized linear relationships, but we will address this issue in the future. The gOMP algorithm does not accept all types of target variables and works only with continuous predictor variables. This is a limitation of the algorithm, but we plan to address this in the future as well.</p>
                <p>Cross-validation functions currently exist only for MMPC, SES and gOMP, and performance metrics are not available for all target variables. Left-censored data is an example of a target variable whose predictive performance estimation is not offered. A last drawback is that currently 
                    <italic toggle="yes">MXM</italic> does not offer graphical visualization of the algorithms and of the models.</p>
            </sec>
            <sec>
                <title>Which FS algorithm from 
                    <italic toggle="yes">MXM</italic> to use and when</title>
                <p>In terms of sample size, FBED and gOMP are generally advised for large-sample-sized datasets, whereas MMPC and SES are designed mainly for small-sample-sized datasets
                    <sup>
                        <xref ref-type="other" rid="FN6">6</xref>
                    </sup>. In the case of a large sample size and few features, forward or backward selection is also suggested. In terms of the number of features, gOMP is the only algorithm that scales up when the number of features is in the order of hundreds of thousands. gOMP is also suitable for high volume data that contain a high number of features, very large sample sizes or both. FBED has been customized to handle high volume data as well, but with large sample sizes and only a few thousand features. If the user is interested in discovering more than one set of features, SES is suitable, since it returns multiple, statistically equivalent solutions. With multiple datasets, MMPC and SES are currently the only two algorithms that can handle some cases (those where both the target variable and the set of features are continuous). As for the type of target variable, MMPC, SES and FBED handle all types of target variables available in 
                    <italic toggle="yes">MXM</italic>, listed in 
                    <xref ref-type="table" rid="T4">Table 4</xref>, while gOMP accepts fewer types of target variables. Regarding the type of features, gOMP currently works with continuous features only, whereas all other algorithms accept both continuous and categorical features. All this information is presented in 
                    <xref ref-type="table" rid="T5">Table 5</xref>.</p>
                <table-wrap id="T5" orientation="portrait" position="anchor">
                    <label>Table 5. </label>
                    <caption>
                        <title>Algorithm suggestion according to combinations of sample size (n) and number of features (p), multiple solutions and high volume data.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Algorithm</th>
                                <th align="center" colspan="6" rowspan="1">Cases</th>
                            </tr>
                            <tr>
                                <th colspan="1" rowspan="1"/>
                                <th align="center" colspan="1" rowspan="1">n small &amp;
                                    <break/>p small</th>
                                <th align="center" colspan="1" rowspan="1">n small &amp;
                                    <break/>p big</th>
                                <th align="center" colspan="1" rowspan="1">n big &amp;
                                    <break/>p small</th>
                                <th align="center" colspan="1" rowspan="1">n big &amp;
                                    <break/>p big</th>
                                <th align="center" colspan="1" rowspan="1">high volume
                                    <break/>data</th>
                                <th align="center" colspan="1" rowspan="1">multiple
                                    <break/>solutions</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td colspan="1" rowspan="1">MMPC</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">MMMB</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">SES</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">BSR</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">FSR</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">IAMB</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">FBED</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td colspan="1" rowspan="1">gOMP</td>
                                <td colspan="1" rowspan="1"/>
                                <td colspan="1" rowspan="1"/>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td align="center" colspan="1" rowspan="1">&#x2713;</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
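                <p>Table 5 can also be read programmatically. The base R sketch below encodes the table as a logical matrix and looks up the suggested algorithms for a given case. The object and function names (table5, suggest_algorithms) and the shortened case labels are hypothetical and not part of 
                    <italic toggle="yes">MXM</italic>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;"># Illustrative lookup built from Table 5; all names are hypothetical.
algorithms &lt;- c("MMPC", "MMMB", "SES", "BSR", "FSR", "IAMB", "FBED", "gOMP")
cases &lt;- c("n small, p small", "n small, p big", "n big, p small",
           "n big, p big", "high volume data", "multiple solutions")
table5 &lt;- matrix(FALSE, length(algorithms), length(cases),
                 dimnames = list(algorithms, cases))
table5[c("MMPC", "MMMB", "SES"), 1:3] &lt;- TRUE
table5[c("BSR", "FSR", "IAMB"), 3] &lt;- TRUE
table5[c("FBED", "gOMP"), 3:5] &lt;- TRUE
table5["SES", "multiple solutions"] &lt;- TRUE

# Return the algorithms ticked in the column of the requested case
suggest_algorithms &lt;- function(case) algorithms[table5[, case]]

big_big &lt;- suggest_algorithms("n big, p big")
multiple &lt;- suggest_algorithms("multiple solutions")</styled-content>
                    </preformat>
                </p>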
            </sec>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Implementation</title>
                <p>

                    <italic toggle="yes">MXM</italic> is an R package that makes use of (depends on or imports) many other packages offering regression models:</p>
                <list list-type="bullet">
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <italic toggle="yes">stats</italic> (built-in package): for generalised linear models.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/survival/index.html">survival</ext-link>: for survival regression.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/MASS/index.html">MASS</ext-link>: for negative binomial regression, ordinal regression and MM type regression.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/ordinal/index.html">ordinal</ext-link>: for ordinal regression.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/nnet/index.html">nnet</ext-link>: for multinomial regression.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/quantreg/index.html">quantreg</ext-link>: for quantile regression.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/lme4/index.html">lme4</ext-link>: for mixed models.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/geepack/index.html">geepack</ext-link>: for GEE models.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/coxme/index.html">coxme</ext-link>: for mixed survival regression models.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/bigmemory/index.html">bigmemory</ext-link>: for large volume data.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/doParallel/index.html">doParallel</ext-link>: for parallel computations.</p>
                    </list-item>
                    <list-item>
                        <label>&#x2022;</label>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/Rfast/index.html">Rfast</ext-link>: for computational efficiency.</p>
                    </list-item>
                </list>
                <p>To help gain computational efficiency, since 
                    <italic toggle="yes">MXM</italic> is not written in C++, 
                    <italic toggle="yes">MXM</italic> imports 
                    <italic toggle="yes">Rfast</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-21">21</xref>
                    </sup>, which was initially created for this purpose. Currently, with little effort, one should be able to plug their own regression model into some of the algorithms. We plan to extend this possibility to all algorithms.</p>
            </sec>
            <sec>
                <title>FS-related functions</title>
                <p>

                    <italic toggle="yes">MXM</italic> contains functions for returning the selected features for a range of hyper-parameters for each algorithm. For example, 
                    <bold>mmpc.path</bold> runs MMPC for multiple combinations of 
                    <italic toggle="yes">threshold</italic> and 
                    <italic toggle="yes">max
                        <sub>k</sub>
                    </italic>, and 
                    <bold>gomp.path</bold> runs 
                    <bold>gOMP</bold> for a range of stopping values. The exception is with FBED, for which the user can give a vector of values of 
                    <italic toggle="yes">K</italic> in 
                    <bold>fbed.reg</bold> instead of a single value. Unfortunately, the path of significance levels cannot be determined in a single run.</p>
                <p>MMPC and SES have been implemented in such a way that the user has the option to store the results from a single run in a 
                    <italic toggle="yes">hash</italic> object. In subsequent runs with different hyper-parameters, this can lead to significant computational savings. These two algorithms give the user an extra advantage: they can search for the subset of feature(s) that rendered one or more specific feature(s) independent of the target variable by using the function 
                    <bold>certificate.of.exclusion</bold>.</p>
                <p>FBED, SES and MMPC are three algorithms sharing some common ground. The list with the results of the univariate associations (test statistic and logged p-value) can be calculated by any one of them and passed on to the others. When one is interested in running many algorithms, this can reduce the computational cost significantly. Note also that the univariate associations in MMPC and SES can be calculated in parallel on multi-core machines. More FS-related functions can be found in 
                    <italic toggle="yes">MXM</italic>&#x2019;s reference manual and vignettes section available on 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=MXM">CRAN</ext-link>.</p>
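                <p>The benefit of computing the univariate associations only once can be sketched in base R as follows. The function name and the returned list layout are assumptions made for illustration and do not reproduce the objects returned by 
                    <italic toggle="yes">MXM</italic>; the test used is the likelihood ratio test for a Gaussian linear model.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;"># Illustrative sketch in base R; names and list layout are hypothetical.
univariate_assoc &lt;- function(y, x) {
  n &lt;- length(y)
  rss0 &lt;- sum((y - mean(y))^2)           # residual sum of squares, empty model
  stat &lt;- apply(x, 2, function(xj) {
    rss1 &lt;- sum(lm.fit(cbind(1, xj), y)$residuals^2)
    n * log(rss0 / rss1)                 # likelihood ratio statistic
  })
  list(stat = stat,
       pvalue = pchisq(stat, df = 1, lower.tail = FALSE, log.p = TRUE))
}

set.seed(1)
x &lt;- matrix(rnorm(100 * 10), 100, 10)
y &lt;- x[, 2] + rnorm(100)

univ &lt;- univariate_assoc(y, x)           # computed once

# Reused for two significance levels without refitting any model
keep_05 &lt;- which(univ$pvalue &lt; log(0.05))
keep_01 &lt;- which(univ$pvalue &lt; log(0.01))</styled-content>
                    </preformat>
                </p>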
            </sec>
            <sec>
                <title>Operation</title>
                <p>

                    <italic toggle="yes">MXM</italic> is distributed as part of the CRAN R package repository and is compatible with Mac OS X, Windows, Solaris and Linux operating systems. Once the package is installed and loaded</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">&gt; install.packages("MXM")
&gt; library(MXM)</styled-content>
                    </preformat>
                </p>
                <p>it is ready to be used without internet connection. The system requirements are documented on 
                    <italic toggle="yes">MXM</italic>&#x2019;s webpage on 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=MXM">CRAN</ext-link>.</p>
            </sec>
        </sec>
        <sec>
            <title>Use cases</title>
            <p>With user-friendliness in mind, extra attention has been paid to keeping the functions within the <italic toggle="yes">MXM</italic> package as consistent as the nature of the algorithms allows, in terms of syntax, required input objects and parameter arguments. 
                <xref ref-type="table" rid="T6">Table 6</xref> contains a list of the current FS algorithms, only some of which we demonstrate here. In all cases, the arguments "target", "dataset" and "test" refer to the target variable, the set of features and the type of regression model to be used, respectively.</p>
            <table-wrap id="T6" orientation="portrait" position="anchor">
                <label>Table 6. </label>
                <caption>
                    <title>An overview of the main FS algorithms in 
                        <italic toggle="yes">MXM</italic>.</title>
                </caption>
                <table content-type="article-table" frame="hsides">
                    <thead>
                        <tr>
                            <th align="left" colspan="1" rowspan="1">Function</th>
                            <th align="left" colspan="1" rowspan="1">Algorithm</th>
                        </tr>
                    </thead>
                    <tbody>
                        <tr>
                            <td colspan="1" rowspan="1">MMPC</td>
                            <td colspan="1" rowspan="1">Max-Min Parents and Children (MMPC)</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">SES</td>
                            <td colspan="1" rowspan="1">Statistically Equivalent Signatures (SES)</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">mmmb</td>
                            <td colspan="1" rowspan="1">Max-Min Markov Blanket (MMMB)</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">fs.reg</td>
                            <td colspan="1" rowspan="1">Forward selection (FSR)</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">bs.reg</td>
                            <td colspan="1" rowspan="1">Backward selection (BSR)</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">iamb</td>
                            <td colspan="1" rowspan="1">Incremental Association Markov Blanket (IAMB)</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">fbed.reg</td>
                            <td colspan="1" rowspan="1">Forward-Backward with Early Dropping (FBED)</td>
                        </tr>
                        <tr>
                            <td colspan="1" rowspan="1">gomp</td>
                            <td colspan="1" rowspan="1">Generalized Orthogonal Matching Pursuit (gOMP)</td>
                        </tr>
                    </tbody>
                </table>
            </table-wrap>
            <p>We will use a variety of target variables and, in some examples, we will show the results produced with different regression models. Under no circumstances should the following examples be considered experimental or for the purpose of comparison. They serve only to demonstrate the algorithms, to give examples of different types of target variables and to show how the algorithms work. All computations took place on a desktop computer with an Intel Core i5-4690K CPU @ 3.50GHz and 32 GB RAM.</p>
            <sec>
                <title>Survival (or time-to-event) target variable</title>
                <p>The first dataset we used concerns breast cancer, with 295 women selected from the fresh-frozen&#x2013;tissue bank of the Netherlands Cancer Institute
                    <sup>
                        <xref ref-type="bibr" rid="ref-22">22</xref>
                    </sup>. The dataset contains 70 features and the target variable is time to event, with 63 censored values
                    <sup>
                        <xref ref-type="other" rid="FN7">7</xref>
                    </sup>. The censoring information must be passed as a numerical status variable (0 = censored, 1 = not censored), for example (1, 1, 0, 1, 1, 1, ...). We will make use of the R package 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/survival/index.html">survival</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-23">23</xref>
                    </sup> for running the appropriate models (Cox and Weibull regression) and show the FBED algorithm with the default arguments. Part of the output is presented below. Information on the selected features, their test statistic and associated logarithmically transformed p-value, along with some information on the number of regression models fitted is displayed.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">&gt; target &lt;- survival::Surv(y, status)
&gt; MXM::fbed.reg(target = target, dataset = dataset, test = "censIndCR")

$res
  sel     stat      pval
1  28 8.183389 -5.466128
2   6 5.527486 -3.978164

$info
    Number of vars Number of tests
K=0              2              73</styled-content>
                    </preformat>
                </p>
                <p>The above output was produced using Cox regression. If we used Weibull regression instead (
                    <italic toggle="yes">test = "censIndWR"</italic>), the output would be slightly different.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">&gt; MXM::fbed.reg(target = target, dataset = dataset, test = "censIndWR")

$res
     sel     stat      pval
Vars  28 8.489623 -5.634692

$info
    Number of vars Number of tests
K=0              1              75</styled-content>
                    </preformat>
                </p>
                <p>To prevent very small p-values (less than the machine epsilon, approximately 10
                    <sup>&#x2212;16</sup>) from being rounded to 0, their logarithm is computed and returned in the results. This is crucial because the algorithms rely on the correct ordering of the p-values.</p>
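                <p>As a minimal illustration of this point, the sketch below uses base R only. It is not code from 
                    <italic toggle="yes">MXM</italic>, and the two test statistics are arbitrary values chosen for the example. It compares chi-square p-values computed on the natural scale with the same p-values computed on the logarithmic scale through the 
                    <italic toggle="yes">log.p</italic> argument of 
                    <italic toggle="yes">pchisq</italic>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;"># Two very large chi-square statistics with 1 degree of freedom
# (illustrative values, not taken from any dataset in this article)
stat &lt;- c(2000, 3000)

# Natural scale: both p-values underflow to exactly 0 and cannot be ranked
p &lt;- pchisq(stat, df = 1, lower.tail = FALSE)

# Logarithmic scale: both values stay finite and keep their ordering
logp &lt;- pchisq(stat, df = 1, lower.tail = FALSE, log.p = TRUE)

p
logp
order(logp)   # ranks the features from most to least significant</styled-content>
                    </preformat>
                </p>
                <p>On the natural scale the two p-values are indistinguishable, whereas their logarithms remain finite and can still be ranked.</p>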
            </sec>
            <sec>
                <title>Unmatched case control target variable</title>
                <p>The second dataset we used again concerns breast cancer
                    <sup>
                        <xref ref-type="bibr" rid="ref-24">24</xref>
                    </sup> and contains 285 samples over 17,187 gene expressions (features). Since the target variable is binary, logistic regression was employed.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">&gt; MXM::gomp(target = target, dataset = dataset, test = "testIndLogistic")</styled-content>
                    </preformat>
                </p>
                <p>The element 
                    <italic toggle="yes">res</italic> presented below is one of the elements of the returned output. The first column shows the selected variables in order of inclusion and the second column is the deviance of each regression model. The first line refers to the regression model with 0 predictor variables (constant term only).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">$res
      Selected Vars  Deviance
 [1,]             0  332.55696
 [2,]          4509  156.33519
 [3,]         17606  131.04428
 [4,]          3856  113.78382
 [5,]         10101   95.76704
 [6,]         16759   80.25748
 [7,]          6466   67.78120
 [8,]         11524   54.54652
 [9,]          9794   44.17957
[10,]          4728   36.52319
[11,]          3620   20.48441
[12,]         13127   5.583645e-10</styled-content>
                    </preformat>
                </p>
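                <p>To make the meaning of the first row concrete, the following sketch (base R, simulated data, not code from 
                    <italic toggle="yes">MXM</italic>) computes the deviance of a logistic regression model with a constant term only. The variable names and the simulated target are assumptions made for the example. For a binary target this deviance equals minus twice the log-likelihood evaluated at the observed proportion of ones.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">set.seed(1)
# Simulated binary target (hypothetical data, for illustration only)
y &lt;- rbinom(100, size = 1, prob = 0.4)

# Logistic regression with 0 predictor variables (constant term only)
fit0 &lt;- glm(y ~ 1, family = binomial)

# The same quantity computed by hand from the observed proportion of ones
p_hat &lt;- mean(y)
dev0 &lt;- -2 * sum(y * log(p_hat) + (1 - y) * log(1 - p_hat))

c(deviance(fit0), dev0)   # the two values agree</styled-content>
                    </preformat>
                </p>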
            </sec>
            <sec>
                <title>Longitudinal data</title>
                <p>The next dataset we will use is 
                    <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE9105">NCBI Gene Expression Omnibus accession number GSE9105</ext-link>
                    <sup>
                        <xref ref-type="bibr" rid="ref-25">25</xref>
                    </sup>, which contains 22,283 features about skeletal muscles from 12 normal, healthy glucose-tolerant individuals exposed to acute physiological hyperinsulinemia, measured at 3 distinct time points. Following 
                    <xref ref-type="bibr" rid="ref-9">9</xref>, we will use SES rather than FBED because the sample size is small. The grouping variable identifying the subject, along with the time points, is necessary in our case. If the data are repeated measurements or clustered data (e.g. families) where no time is involved, the argument "reps" need not be provided. The user has the option to use GLMM
                    <sup>
                        <xref ref-type="bibr" rid="ref-26">26</xref>
                    </sup> or GEE
                    <sup>
                        <xref ref-type="bibr" rid="ref-27">27</xref>
                    </sup>.</p>
                <p>The output of SES (and of MMPC) is long and verbose, so we present only the first 10 equivalent signatures. The first row is the set of selected features, and every other row is an equivalent set. In this example, the last four columns are identical and only the first changes. This means that, in the rows shown, feature 2683 has 9 statistically equivalent features (2, 7, ..., 836, 1117).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">&gt; MXM::SES.temporal(target = target, reps = reps, group = group,
                         dataset = dataset, test = "testIndGLMMReg")
@signatures[1:10,]
      Var1 Var2 Var3  Var4  Var5
 [1,] 2683 6155 9414 13997 21258
 [2,]    2 6155 9414 13997 21258
 [3,]    7 6155 9414 13997 21258
 [4,]   10 6155 9414 13997 21258
 [5,]   18 6155 9414 13997 21258
 [6,]  213 6155 9414 13997 21258
 [7,]  393 6155 9414 13997 21258
 [8,]  699 6155 9414 13997 21258
 [9,]  836 6155 9414 13997 21258
[10,] 1117 6155 9414 13997 21258</styled-content>
                    </preformat>
                </p>
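                <p>The way such a matrix of signatures is read can be sketched in base R. The code below is an illustration only, not part of 
                    <italic toggle="yes">MXM</italic>: it re-creates the first four rows of the matrix printed above by hand and then finds which position of the signature varies and which features are interchangeable with the one in the first row. The object names are assumptions made for the example.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;"># First four rows of the signatures matrix shown above, typed in by hand
signatures &lt;- rbind(c(2683, 6155, 9414, 13997, 21258),
                    c(   2, 6155, 9414, 13997, 21258),
                    c(   7, 6155, 9414, 13997, 21258),
                    c(  10, 6155, 9414, 13997, 21258))
colnames(signatures) &lt;- paste0("Var", 1:5)

# Number of distinct features appearing in each position of the signature
n_alternatives &lt;- apply(signatures, 2, function(x) length(unique(x)))

# Positions with more than one distinct value hold interchangeable features
varying &lt;- names(n_alternatives)[n_alternatives &gt; 1]

# Features equivalent to the one in the first (reference) signature
equivalents &lt;- setdiff(signatures[, "Var1"], signatures[1, "Var1"])

varying
equivalents</styled-content>
                    </preformat>
                </p>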
            </sec>
            <sec>
                <title>Continuous target variable</title>
                <p>The next dataset we consider comes from the study "Human cerebral organoids recapitulate gene expression programs of fetal neocortex development"
                    <sup>
                        <xref ref-type="bibr" rid="ref-28">28</xref>
                    </sup>. The data are pre-processed RNA-seq, thus continuous data, with 729 samples and 58,037 features. We selected the first feature as the target variable and all the rest were considered to be the features. In this case we used FBED and gOMP, employing the Pearson correlation coefficient because all measurements are continuous.</p>
                <p>FBED performed 123,173 tests and selected 63 features.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">&gt; MXM::fbed.reg(target = target, dataset = dataset, test = "testIndFisher")

$info
    Number of vars Number of tests
K=0             63          123173</styled-content>
                    </preformat>
                </p>
                <p>gOMP, on the other hand, was more parsimonious, selecting only 8 features. We must highlight that the selection of a feature was based on the adjusted 
                    <italic toggle="yes">R</italic>
                    <sup>2</sup> value: a candidate feature was selected if it increased the adjusted 
                    <italic toggle="yes">R</italic>
                    <sup>2</sup> by more than 0.01 (i.e. 1%).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">&gt; MXM::gomp(target = target, dataset = dataset, test = "testIndFisher",
method = "ar2", tol = 0.01)

$res
       Vars adjusted R2
 [1,]     0   0.0000000
 [2,] 11394   0.3056431
 [3,]  4143   0.4493530
 [4,] 49524   0.4744709
 [5,]     8   0.4936872
 [6,] 29308   0.5096887
 [7,]  8619   0.5287238
 [8,]  3194   0.5411237
 [9,]  5958   0.5513510</styled-content>
                    </preformat>
                </p>
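                <p>The stopping rule based on the adjusted 
                    <italic toggle="yes">R</italic>
                    <sup>2</sup> can be sketched as follows. This is a simplified illustration on simulated data written in base R, not the implementation used in 
                    <italic toggle="yes">MXM</italic>. At each step the candidate most correlated with the current residuals is considered, and the search stops as soon as the adjusted 
                    <italic toggle="yes">R</italic>
                    <sup>2</sup> increases by no more than 
                    <italic toggle="yes">tol</italic>. All object names and the simulated data are assumptions made for the example.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">set.seed(1)
n &lt;- 200
p &lt;- 20
x &lt;- matrix(rnorm(n * p), n, p)             # simulated features
y &lt;- 2 * x[, 3] - 1.5 * x[, 7] + rnorm(n)   # only features 3 and 7 matter

tol &lt;- 0.01              # minimum required increase in adjusted R2
selected &lt;- integer(0)   # indices of the selected features
ar2 &lt;- 0                 # adjusted R2 of the constant-only model
res &lt;- y - mean(y)       # residuals of the constant-only model

repeat {
  candidates &lt;- setdiff(seq_len(p), selected)
  if (length(candidates) == 0) break
  # Candidate most correlated (in absolute value) with the current residuals
  r &lt;- abs(cor(x[, candidates, drop = FALSE], res))
  best &lt;- candidates[which.max(r)]
  # Refit the linear model with the candidate added
  fit &lt;- lm(y ~ x[, c(selected, best), drop = FALSE])
  new_ar2 &lt;- summary(fit)$adj.r.squared
  # Stop when the improvement is not larger than the tolerance
  if (new_ar2 - ar2 &lt;= tol) break
  selected &lt;- c(selected, best)
  ar2 &lt;- new_ar2
  res &lt;- resid(fit)
}

selected
ar2</styled-content>
                    </preformat>
                </p>
                <p>With data simulated in this way, one would expect the loop to retain the two informative features and to stop once only noise features remain, since a noise feature rarely raises the adjusted 
                    <italic toggle="yes">R</italic>
                    <sup>2</sup> by more than the tolerance.</p>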
            </sec>
            <sec>
                <title>Count data</title>
                <p>The final example concerns a discrete-valued target variable (count data), for which Poisson and quasi-Poisson regression models will be employed by the gOMP algorithm. The dataset with GEO accession number GSE47774
                    <sup>
                        <xref ref-type="bibr" rid="ref-29">29</xref>
                    </sup> contains RNA-seq data with 256 samples and 43,919 features. We selected the first feature to be the target variable and all the rest are the features.</p>
                <p>We ran gOMP using Poisson (
                    <italic toggle="yes">test="testIndPois"</italic>) and quasi-Poisson (
                    <italic toggle="yes">test="testIndQPois"</italic>) regression models, but we changed the stopping value to 
                    <italic toggle="yes">tol=12</italic>. Due to over-dispersion (variance &gt; mean), quasi-Poisson regression is appropriate
                    <sup>
                        <xref ref-type="other" rid="FN8">8</xref>
                    </sup> because Poisson regression assumes these two quantities are equal. With Poisson regression, 107 features were selected; because the model was misspecified, many of them are likely false positives. With quasi-Poisson regression, only 10 features were selected.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">&gt; MXM::gomp(target = target, dataset = dataset, test = "testIndQPois",
tol = 12)

$res
      Selected Vars   Deviance
 [1,]             0 3821661.14
 [2,]          6391  145967.17
 [3,]         12844  129639.56
 [4,]         26883  113706.51
 [5,]         32680  108387.15
 [6,]         29370  102407.46
 [7,]          4274   96817.48
 [8,]         43570   91373.77
 [9,]         43294   86125.30
[10,]         31848   81659.51
[11,]         38299   77295.71</styled-content>
                    </preformat>
                </p>
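                <p>The effect of over-dispersion on the two models can be illustrated with a small simulation. The sketch below uses base R only (
                    <italic toggle="yes">glm</italic> with the 
                    <italic toggle="yes">poisson</italic> and 
                    <italic toggle="yes">quasipoisson</italic> families) and simulated negative binomial counts; it is not code from 
                    <italic toggle="yes">MXM</italic>, and all object names and parameter values are assumptions made for the example. Both models give the same coefficients, but the quasi-Poisson standard errors are inflated by the square root of the estimated dispersion, which makes spurious features less likely to appear significant.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#000000;">set.seed(1)
n &lt;- 500
x &lt;- rnorm(n)
# Over-dispersed counts: negative binomial variance mu + mu^2 / size exceeds mu
y &lt;- rnbinom(n, mu = exp(1 + 0.5 * x), size = 1)

fit_pois  &lt;- glm(y ~ x, family = poisson)
fit_qpois &lt;- glm(y ~ x, family = quasipoisson)

# Dispersion is fixed at 1 under Poisson and estimated under quasi-Poisson
dispersion &lt;- summary(fit_qpois)$dispersion

# Standard errors of the two fits; the coefficients themselves are identical
se_pois  &lt;- summary(fit_pois)$coefficients[, "Std. Error"]
se_qpois &lt;- summary(fit_qpois)$coefficients[, "Std. Error"]

dispersion
se_qpois / se_pois   # equals sqrt(dispersion) for every coefficient</styled-content>
                    </preformat>
                </p>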
            </sec>
            <sec>
                <title>Applications of SES and gOMP</title>
                <p>The case of an ordinal target variable (e.g. very low, low, high, very high) has been treated previously
                    <sup>
                        <xref ref-type="bibr" rid="ref-30">30</xref>
                    </sup> for revealing interesting features measuring the user-perceived quality of experience with YouTube video streaming applications and the Quality of Service (target variable) of the underlying network under different network conditions.</p>
                <p>Most recently, SES and gOMP were applied in the field of fisheries for identifying the genetic SNP loci that are associated with certain phenotypes of the gilthead seabream (<italic toggle="yes">Sparus aurata</italic>)
                    <sup>
                        <xref ref-type="bibr" rid="ref-31">31</xref>
                    </sup>. Measurements were taken from multiple cultured seabream families, so the data are correlated and GLMM had to be applied. Several of the discovered genes have already been associated with growth in other teleosts or even mice, such as genes 
                    <italic toggle="yes">MBD5</italic>, 
                    <italic toggle="yes">ACVRIIA</italic> and 
                    <italic toggle="yes">IRF7</italic>. The study led to a catalogue of genetic markers that lays the groundwork for understanding growth and other traits of interest in gilthead seabream, in order to maximize the aquaculture yield.</p>
            </sec>
        </sec>
        <sec>
            <title>Summary</title>
            <p>We presented the R package 
                <italic toggle="yes">MXM</italic> and some of its feature selection algorithms. We discussed its advantages and disadvantages and compared it, at a high level, with other competing R packages. We then demonstrated, using real high-dimensional data with a diversity of types of target variables, four FS algorithms, including different regression models in some cases.</p>
            <p>The package is constantly being updated, with new functions and improvements being added and algorithms being transferred to C++ to decrease the computational cost. Computational efficiency was mentioned as one of 
                <italic toggle="yes">MXM</italic>&#x2019;s disadvantages, which we are trying to address. However, computational efficiency is one aspect, and flexibility another. To this end we plan to add more regression models, more functionalities, options and graphical visualizations.</p>
        </sec>
        <sec>
            <title>Data availability</title>
            <list list-type="bullet">
                <list-item>
                    <label>&#x2022;</label>
                    <p>The first dataset we used (survival target variable) is available from 
                        <ext-link ext-link-type="uri" xlink:href="http://ccb.nki.nl/data/">Computational Cancer Biology</ext-link>.</p>
                </list-item>
                <list-item>
                    <label>&#x2022;</label>
                    <p>The second dataset we used (unmatched case control target variable) is available from 
                        <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=gse2034">GEO</ext-link>.</p>
                </list-item>
                <list-item>
                    <label>&#x2022;</label>
                    <p>The third dataset we used (longitudinal data) is available from 
                        <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE9105">GEO</ext-link>.</p>
                </list-item>
                <list-item>
                    <label>&#x2022;</label>
                    <p>The fourth dataset we used (continuous target variable) is available from 
                        <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75140">GEO</ext-link>.</p>
                </list-item>
                <list-item>
                    <label>&#x2022;</label>
                    <p>The fifth dataset we used (count data) is available from 
                        <ext-link ext-link-type="uri" xlink:href="https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE47792">GEO</ext-link>.</p>
                </list-item>
            </list>
        </sec>
        <sec>
            <title>Software availability</title>
            <p>MXM is available from: 
                <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/MXM/index.html">https://cran.r-project.org/web/packages/MXM/index.html</ext-link>.</p>
            <p>Source code available from: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/cran/MXM">https://github.com/cran/MXM</ext-link>.</p>
            <p>Archived source code at time of publication: 
                <ext-link ext-link-type="uri" xlink:href="http://doi.org/10.5281/zenodo.1410043">http://doi.org/10.5281/zenodo.1410043</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-32">32</xref>
                </sup>.</p>
            <p>License: 
                <ext-link ext-link-type="uri" xlink:href="https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html">GPL-2</ext-link>.</p>
        </sec>
    </body>
    <back>
        <ack>
            <title>Acknowledgments</title>
            <p>We would like to acknowledge Stefanos Fafalios, Zacharias Papadovasilakis, Christina Chatzipantsiou, Kleio-Maria Verrou, and Manos Papadakis for their constructive feedback.</p>
        </ack>
        <fn-group>
            <fn id="FN1">
                <p>
                    <sup>1</sup>In statistics and in the R packages the term "big data" is used to refer to such data. In the computer science terminology, big data are of much higher volume and require specific technology. For this reason we chose to use the term "high volume" instead of "big data".</p>
            </fn>
            <fn id="FN2">
                <p>
                    <sup>2</sup>MXM is mainly FS oriented, but it offers (Bayesian) network learning algorithms as well. Many feature selection algorithms offered in 
                    <italic toggle="yes">MXM</italic> are Bayesian network inspired.</p>
            </fn>
            <fn id="FN3">
                <p>
                    <sup>3</sup>We highlight the fact that, especially on 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/">CRAN</ext-link>, packages are uploaded at a super-linear rate. Bioconductor is stricter about the addition of new packages. Packages that are abandoned or left unmaintained for a long time are not at all unusual. One such example is "biospear", removed from 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/">CRAN</ext-link> (archived) on the 30th of April 2018. On the other hand, we added a package that performs FS without mentioning this in its title.</p>
            </fn>
            <fn id="FN4">
                <p>
                    <sup>4</sup>For the long list of available target variables and regression models, expanding 
                    <xref ref-type="table" rid="T4">Table 4</xref>, see 
                    <ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/web/packages/MXM/vignettes/FS_guide.pdf">Guide on performing FS with the R package 
                        <italic toggle="yes">MXM</italic>
                    </ext-link>.</p>
            </fn>
            <fn id="FN5">
                <p>
                    <sup>5</sup>In our anecdotal experiments it has outperformed the LASSO implementation in the package 
                    <italic toggle="yes">glmnet</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-20">20</xref>
                    </sup> in both time and performance.</p>
            </fn>
            <fn id="FN6">
                <p>
                    <sup>6</sup>To the best of our knowledge there are not many FS algorithms dealing with small sample sized data.</p>
            </fn>
            <fn id="FN7">
                <p>
                    <sup>7</sup>Censoring occurs when only partial information about some observations is available. Some individuals may experience the event after completion of the study, or an individual may leave the study for a reason other than the occurrence of the event of interest. In a study about cancer, for example, some patients may die of another cause, such as another disease or a car accident. The survival times of those patients have been recorded, but offer limited information.</p>
            </fn>
            <fn id="FN8">
                <p>
                    <sup>8</sup>Negative binomial regression (
                    <italic toggle="yes">test="testIndNB"</italic>) is another option.</p>
            </fn>
        </fn-group>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Aliferis</surname>
                            <given-names>CF</given-names>
                        </name>
</person-group>:
                    <article-title>Towards principled feature selection: relevancy, filters and wrappers.</article-title>In
                    <source>

                        <italic toggle="yes">AISTATS.</italic>
</source>
                    <year>2003</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dsl-lab.org/ml_tutorial_old/Publications/aistats2003.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lagani</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Athineou</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Farcomeni</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Feature Selection with the R Package MXM: Discovering Statistically-Equivalent Feature Subsets.</article-title>
                    <source>

                        <italic toggle="yes">J Stat Softw.</italic>
</source>
                    <year>2017</year>;<volume>80</volume>(<issue>7</issue>).
                    <pub-id pub-id-type="doi">10.18637/jss.v080.i07</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Aliferis</surname>
                            <given-names>CF</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Statnikov</surname>
                            <given-names>AR</given-names>
                        </name>
</person-group>:
                    <article-title>Algorithms for Large Scale Markov Blanket Discovery.</article-title>In
                    <source>

                        <italic toggle="yes">FLAIRS Conference.</italic>
</source>
                    <year>2003</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.aaai.org/Papers/FLAIRS/2003/Flairs03-073.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Aliferis</surname>
                            <given-names>CF</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Statnikov</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>Time and sample efficient discovery of Markov Blankets and direct causal relations.</article-title>In
                    <source>

                        <italic toggle="yes">Proceedings of the ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.</italic>
</source>ACM.<year>2003</year>;<fpage>673</fpage>&#x2013;<lpage>678</lpage>.
                    <pub-id pub-id-type="doi">10.1145/956750.956838</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Aliferis</surname>
                            <given-names>CF</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Statnikov</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Local causal and Markov Blanket induction for causal discovery and feature selection for classification part II: Analysis and extensions.</article-title>
                    <source>

                        <italic toggle="yes">J Mach Learn Res.</italic>
</source>
                    <year>2010</year>;<volume>11</volume>:<fpage>235</fpage>&#x2013;<lpage>284</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="http://jmlr.csail.mit.edu/papers/volume11/aliferis10b/aliferis10b.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Brown</surname>
                            <given-names>LE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Aliferis</surname>
                            <given-names>CF</given-names>
                        </name>
</person-group>:
                    <article-title>The Max-Min Hill-Climbing Bayesian network structure learning algorithm.</article-title>
                    <source>

                        <italic toggle="yes">Mach Learn.</italic>
</source>
                    <year>2006</year>;<volume>65</volume>(<issue>1</issue>):<fpage>31</fpage>&#x2013;<lpage>78</lpage>.
                    <pub-id pub-id-type="doi">10.1007/s10994-006-6889-7</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lagani</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>
</person-group>:
                    <article-title>Structure-based variable selection for survival data.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2010</year>;<volume>26</volume>(<issue>15</issue>):<fpage>1887</fpage>&#x2013;<lpage>1894</lpage>.
                    <pub-id pub-id-type="pmid">20519286</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btq261</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lagani</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kortas</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>
</person-group>:
                    <article-title>Biomarker signature identification in "omics" data with multi-class outcome.</article-title>
                    <source>

                        <italic toggle="yes">Comput Struct Biotechnol J.</italic>
</source>
                    <year>2013</year>;<volume>6</volume>(<issue>7</issue>):<fpage>e201303004</fpage>.
                    <pub-id pub-id-type="pmid">24688712</pub-id>
                    <pub-id pub-id-type="doi">10.5936/csbj.201303004</pub-id>
                    <pub-id pub-id-type="pmcid">3962136</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tsagris</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lagani</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>
</person-group>:
                    <article-title>Feature selection for high-dimensional temporal data.</article-title>
                    <source>

                        <italic toggle="yes">BMC Bioinformatics.</italic>
</source>
                    <year>2018</year>;<volume>19</volume>(<issue>1</issue>):<fpage>17</fpage>.
                    <pub-id pub-id-type="pmid">29357817</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s12859-018-2023-7</pub-id>
                    <pub-id pub-id-type="pmcid">5778658</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Groll</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tutz</surname>
                            <given-names>G</given-names>
                        </name>
</person-group>:
                    <article-title>Variable selection for generalized linear mixed models by 
                        <italic toggle="yes">L</italic>
                        <sup>1</sup>-penalized estimation.</article-title>
                    <source>

                        <italic toggle="yes">Stat Comput.</italic>
</source>
                    <year>2014</year>;<volume>24</volume>(<issue>2</issue>):<fpage>137</fpage>&#x2013;<lpage>154</lpage>.
                    <pub-id pub-id-type="doi">10.1007/s11222-012-9359-z</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Borboudakis</surname>
                            <given-names>G</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>
</person-group>:
                    <article-title>Forward-backward selection with early dropping.</article-title>
                    <source>

                        <italic toggle="yes">arXiv preprint arXiv: 1705.10770.</italic>
</source>
                    <year>2017</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/pdf/1705.10770.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Billings</surname>
                            <given-names>SA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Luo</surname>
                            <given-names>W</given-names>
                        </name>
</person-group>:
                    <article-title>Orthogonal least squares methods and their application to non-linear system identification.</article-title>
                    <source>

                        <italic toggle="yes">Int J Control.</italic>
</source>
                    <year>1989</year>;<volume>50</volume>(<issue>5</issue>):<fpage>1873</fpage>&#x2013;<lpage>1896</lpage>.
                    <pub-id pub-id-type="doi">10.1080/00207178908953472</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Pati</surname>
                            <given-names>YC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rezaiifar</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Krishnaprasad</surname>
                            <given-names>PS</given-names>
                        </name>
</person-group>:
                    <article-title>Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition.</article-title> In
                    <italic toggle="yes">Signals, Systems and Computers, 1993. 1993 Conference Record of The Twenty-Seventh Asilomar Conference on.</italic> IEEE. <year>1993</year>;<fpage>40</fpage>&#x2013;<lpage>44</lpage>.
                    <pub-id pub-id-type="doi">10.1109/ACSSC.1993.342465</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Davis</surname>
                            <given-names>G</given-names>
                        </name>
</person-group>:
                    <article-title>Adaptive nonlinear approximations.</article-title> PhD thesis, New York University, Graduate School of Arts and Science, <year>1994</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.geoffdavis.net/papers/dissertation.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Schwarz</surname>
                            <given-names>G</given-names>
                        </name>
</person-group>:
                    <article-title>Estimating the dimension of a model.</article-title>
                    <source>

                        <italic toggle="yes">Ann Stat.</italic>
</source>
                    <year>1978</year>;<volume>6</volume>(<issue>2</issue>):<fpage>461</fpage>&#x2013;<lpage>464</lpage>.
                    <pub-id pub-id-type="doi">10.1214/aos/1176344136</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>Z</given-names>
                        </name>
</person-group>:
                    <article-title>Extended Bayesian information criteria for model selection with large model spaces.</article-title>
                    <source>

                        <italic toggle="yes">Biometrika.</italic>
</source>
                    <year>2008</year>;<volume>95</volume>(<issue>3</issue>):<fpage>759</fpage>&#x2013;<lpage>771</lpage>.
                    <pub-id pub-id-type="doi">10.1093/biomet/asn034</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lagani</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pappas</surname>
                            <given-names>D</given-names>
                        </name>
</person-group>:
                    <article-title>Discovering multiple, equivalent biomarker signatures.</article-title> In
                    <italic toggle="yes">Proceedings of the 7th conference of the Hellenic Society for Computational Biology &amp; Bioinformatics</italic>, Heraklion, Crete, Greece. <year>2012</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.mensxmachina.org/files/publications/MultipleSignatureHSCBB12v03.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ein-Dor</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kela</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Getz</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Outcome signature genes in breast cancer: is there a unique set?</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2005</year>;<volume>21</volume>(<issue>2</issue>):<fpage>171</fpage>&#x2013;<lpage>178</lpage>.
                    <pub-id pub-id-type="pmid">15308542</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/bth469</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lagani</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Karozou</surname>
                            <given-names>AD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gomez-Cabrero</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A comparative evaluation of data-merging and meta-analysis methods for reconstructing gene-gene interactions.</article-title>
                    <source>

                        <italic toggle="yes">BMC Bioinformatics.</italic>
</source>
                    <year>2016</year>;<volume>17 Suppl 5</volume>:<fpage>194</fpage>.
                    <pub-id pub-id-type="pmid">27294826</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s12859-016-1038-1</pub-id>
                    <pub-id pub-id-type="pmcid">4905611</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Friedman</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hastie</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tibshirani</surname>
                            <given-names>R</given-names>
                        </name>
</person-group>:
                    <article-title>Regularization Paths for Generalized Linear Models via Coordinate Descent.</article-title>
                    <source>

                        <italic toggle="yes">J Stat Softw.</italic>
</source>
                    <year>2010</year>;<volume>33</volume>(<issue>1</issue>):<fpage>1</fpage>&#x2013;<lpage>22</lpage>.
                    <pub-id pub-id-type="pmid">20808728</pub-id>
                    <pub-id pub-id-type="pmcid">2929880</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Papadakis</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsagris</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dimitriadis</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Rfast: A Collection of Efficient and Extremely Fast R Functions</article-title>. R package version 1.9.0. <year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=Rfast">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>van de Vijver</surname>
                            <given-names>MJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>He</surname>
                            <given-names>YD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>van&#x2019;t Veer</surname>
                            <given-names>LJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A gene-expression signature as a predictor of survival in breast cancer.</article-title>
                    <source>

                        <italic toggle="yes">N Engl J Med.</italic>
</source>
                    <year>2002</year>;<volume>347</volume>(<issue>25</issue>):<fpage>1999</fpage>&#x2013;<lpage>2009</lpage>.
                    <pub-id pub-id-type="pmid">12490681</pub-id>
                    <pub-id pub-id-type="doi">10.1056/NEJMoa021967</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Therneau</surname>
                            <given-names>TM</given-names>
                        </name>
</person-group>:
                    <article-title>A Package for Survival Analysis in R.</article-title> Version 2.42-6. <year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://CRAN.R-project.org/package=survival">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Klijn</surname>
                            <given-names>JG</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer.</article-title>
                    <source>

                        <italic toggle="yes">Lancet.</italic>
</source>
                    <year>2005</year>;<volume>365</volume>(<issue>9460</issue>):<fpage>671</fpage>&#x2013;<lpage>679</lpage>.
                    <pub-id pub-id-type="pmid">15721472</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-25">
                <label>25</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Coletta</surname>
                            <given-names>DK</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Balas</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chavez</surname>
                            <given-names>AO</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Effect of acute physiological hyperinsulinemia on gene expression in human skeletal muscle 
                        <italic toggle="yes">in vivo</italic>.</article-title>
                    <source>

                        <italic toggle="yes">Am J Physiol Endocrinol Metab.</italic>
</source>
                    <year>2008</year>;<volume>294</volume>(<issue>5</issue>):<fpage>E910</fpage>&#x2013;<lpage>E917</lpage>.
                    <pub-id pub-id-type="pmid">18334611</pub-id>
                    <pub-id pub-id-type="doi">10.1152/ajpendo.00607.2007</pub-id>
                    <pub-id pub-id-type="pmcid">3581328</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bates</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>M&#x00e4;chler</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bolker</surname>
                            <given-names>B</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Fitting linear mixed-effects models using lme4.</article-title>
                    <source>

                        <italic toggle="yes">arXiv preprint arXiv: 1406.5823.</italic>
</source>
                    <year>2014</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/pdf/1406.5823.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>H&#x00f8;jsgaard</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Halekoh</surname>
                            <given-names>U</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yan</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <article-title>Package geepack</article-title>. R package version 1.2-0. <year>2015</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://mirror.kakao.com/CRAN/web/packages/geepack/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Camp</surname>
                            <given-names>JG</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Badsha</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Florio</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Human cerebral organoids recapitulate gene expression programs of fetal neocortex development.</article-title>
                    <source>

                        <italic toggle="yes">Proc Natl Acad Sci U S A.</italic>
</source>
                    <year>2015</year>;<volume>112</volume>(<issue>51</issue>):<fpage>15672</fpage>&#x2013;<lpage>15677</lpage>.
                    <pub-id pub-id-type="pmid">26644564</pub-id>
                    <pub-id pub-id-type="doi">10.1073/pnas.1520760112</pub-id>
                    <pub-id pub-id-type="pmcid">4697386</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <collab>SEQC/MAQC-III Consortium</collab>:
                    <article-title>A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium.</article-title>
                    <source>

                        <italic toggle="yes">Nat Biotechnol.</italic>
</source>
                    <year>2014</year>;<volume>32</volume>(<issue>9</issue>):<fpage>903</fpage>&#x2013;<lpage>914</lpage>.
                    <pub-id pub-id-type="pmid">25150838</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nbt.2957</pub-id>
                    <pub-id pub-id-type="pmcid">4321899</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Katsarakis</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Teixeira</surname>
                            <given-names>RC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Papadopouli</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Towards a causal analysis of video QoE from network and application QoS.</article-title> In
                    <italic toggle="yes">Proceedings of the 2016 workshop on QoE-based Analysis and Management of Data Communication Networks.</italic> ACM. <year>2016</year>;<fpage>31</fpage>&#x2013;<lpage>36</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="https://dl.acm.org/citation.cfm?id=2940142">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-31">
                <label>31</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kyriakis</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kanterakis</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Manousaki</surname>
                            <given-names>T</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Scanning of genetic variants and genetic mapping of phenotypic traits in gilthead seabream (<italic toggle="yes">Sparus aurata</italic>).</article-title>
                    <source>

                        <italic toggle="yes">In preparation.</italic>
</source>
                    <year>2018</year>.</mixed-citation>
            </ref>
            <ref id="ref-32">
                <label>32</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Tsagris</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsamardinos</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lagani</surname>
                            <given-names>V</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Feature Selection (Including Multiple Solutions) and Bayesian Networks (Version 1.3.9).</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2018</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.doi.org/10.5281/zenodo.1410043">http://www.doi.org/10.5281/zenodo.1410043</ext-link>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report42926">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.17707.r42926</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Qiu</surname>
                        <given-names>Huitong</given-names>
                    </name>
                    <xref ref-type="aff" rid="r42926a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r42926a1">
                    <label>1</label>Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>28</day>
                <month>1</month>
                <year>2019</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2019 Qiu H</copyright-statement>
                <copyright-year>2019</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport42926" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.16216.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>The manuscript introduces a new R package, MXM, that offers a variety of feature selection algorithms for regression models. The package makes a relevant contribution to the toolbox of feature selection algorithms by covering more types of target variables than most existing packages, accommodating data with small or large sample sizes, and providing additional functionality and utility features. The manuscript also demonstrates the usage of several functions in the package by analyzing real data.</p>
            <p>Regarding the presentation of the manuscript, I share reviewer Thodoris's concern that the acronyms in the paper are poorly explained. Although several of the acronyms are explained in Table 4 and Table 6, most appearances of these acronyms carry no reference to these tables or to any explanation. This makes the paper unnecessarily hard to read.</p>
            <p>Secondly, I think the demonstration of the package's usage would be more informative with more interpretation. For example, how should we interpret the p-values, adjusted R-squared values, and deviances in the results of the model fits? High-level descriptions of the algorithms behind the demonstrated functions would also help the reader better understand the results.</p>
            <p>Below are comments on specific places in the manuscript: 
                <list list-type="bullet">
                    <list-item>
                        <p>The third paragraph of the Introduction mentions that statistical tests (likelihood ratio, Wald, permutation-based) can be plugged into the feature selection algorithms that work with small-sample data. These tests traditionally rely on large-sample theory. What adjustments are needed to accommodate small sample sizes?</p>
                    </list-item>
                    <list-item>
                        <p>Tables 1 and 2 and Figure 1 show statistics of target variable types supported by a number of R packages. What algorithm was used to identify the types of target variables accepted by an R package?</p>
                    </list-item>
                    <list-item>
                        <p>What does &#x201c;multiple datasets&#x201d; mean as in &#x201c;Only 2 (1.08%) R packages teat the case of FS with multiple datasets ...&#x201d; (last sentence of Paragraph 1 on Page 5)?</p>
                    </list-item>
                    <list-item>
                        <p>I find the last paragraph on Page 5 particularly hard to read due to the acronyms used without explanation. Some of these are later explained in Table 6 while a couple are not (e.g., MMHC). I think it would be helpful to make a comprehensive table explaining the algorithm acronyms used in this paper, and refer to the table here.</p>
                    </list-item>
                    <list-item>
                        <p>In the last paragraph on Page 5, there are multiple places that compare algorithms in terms of &#x201c;(predictive) performance&#x201d;. I think it would be more informative to clarify what performance metrics we are looking at.</p>
                    </list-item>
                    <list-item>
                        <p>In Paragraph 5 on Page 6, the manuscript states that &#x201c;SES ... returns multiple (statistically equivalent) sets of predictor variables ...&#x201d;. What does statistically equivalent mean?</p>
                    </list-item>
                    <list-item>
                        <p>What does the package name, MXM, stand for?</p>
                    </list-item>
                    <list-item>
                        <p>The second-to-last paragraph on Page 6 is a bit confusing to me. The paragraph starts by pointing out computational efficiency as a disadvantage of MXM. However, this is followed by an explanation of why gOMP is efficient, and by the observation that SES and MMPC scale better and run faster than the LASSO package. These seem like advantages in computational efficiency.</p>
                    </list-item>
                    <list-item>
                        <p>In Paragraph 2 of &#x201c;FS-related functions&#x201d; section: for MMPC and SES, storing the results from one run and passing it to subsequent runs can lead to significant computational savings. Are there savings because the trained models serve as good starting points in the subsequent runs?</p>
                    </list-item>
                    <list-item>
                        <p>On Page 9 in the output of MXM::fbed.reg, there are p-values associated with each selected variable. How can we interpret these p-values? It seems that these p-values are not adjusted for the extra degrees of freedom from feature selection.</p>
                    </list-item>
                </list> Below are some minor suggestions/typo fixes: 
                <list list-type="bullet">
                    <list-item>
                        <p>In Abstract &#x2013; &#x201c;The R package MXM is such an example, which offers ...&#x201d;: It&#x2019;s unclear to me what &#x201c;such an example&#x201d; refers to. It&#x2019;s clearer to state &#x201c;The R package MXM offers ...&#x201d; directly.</p>
                    </list-item>
                    <list-item>
                        <p>Second paragraph in Introduction: &#x201c;For example, packages that accept few or specific types of target variables.&#x201d; &#x2192; &#x201c;For example, some packages accept few or specific types of target variables.&#x201d; (make it a complete sentence.)</p>
                    </list-item>
                    <list-item>
                        <p>In &#x201c;MXM versus other R packages&#x201d; section: &#x201c;... can treat at least one type of target variable, ...&#x201d; &#x2192; &#x201c;... can treat at least k types of target variables, for k = 1, 2, ..., 8, ...&#x201d;</p>
                    </list-item>
                    <list-item>
                        <p>In the last paragraph on Page 5, does &#x201c;BN&#x201d; refer to Bayesian network (which also appears in the same paragraph)?</p>
                    </list-item>
                    <list-item>
                        <p>In Paragraph 5 on Page 6 &#x2013; &#x201c;... making it one one of the few FS algorithms ...&#x201d;: remove the extra &#x201c;one&#x201d;.</p>
                    </list-item>
                    <list-item>
                        <p>In Paragraph 6 on Page 6 &#x2013; &#x201c;MXM is using an efficient memory handling R package.&#x201d;: please cite the package.</p>
                    </list-item>
                    <list-item>
                        <p>Last Paragraph on Page 7: &#x201c;generalised&#x201d; &#x2192; &#x201c;generalized&#x201d; to be consistent with other places in the paper.</p>
                    </list-item>
                    <list-item>
                        <p>First paragraph in Section &#x201c;Use cases&#x201d;: the argument &#x201c;test&#x201d; refers to the type of regression model to be used. Why not name the argument something like &#x201c;model type&#x201d;?</p>
                    </list-item>
                    <list-item>
                        <p>Some of the citations are not easily distinguishable from footnotes. e.g., citation 21 for Rfast on Page 8 and footnote 7 are both superscripts.</p>
                    </list-item>
                    <list-item>
                        <p>Paragraph 2 on Page 11: &#x201c;gOMP on the other has ...&#x201d; &#x2192; &#x201c;gOMP on the other hand ...&#x201d;.</p>
                    </list-item>
                    <list-item>
                        <p>First paragraph on Page 12 &#x2013; &#x201c;YouTube video streaming applications applications&#x201d;: duplicate &#x201c;applications&#x201d;.</p>
                    </list-item>
                </list>
            </p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Partly</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Partly</p>
            <p>Reviewer Expertise:</p>
            <p>statistical machine learning, feature selection, high dimensional data, graphical models, time series analysis, clinical trial design</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment4748-42926">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Tsagris</surname>
                            <given-names>Michail</given-names>
                        </name>
                        <aff>University of Crete, Greece</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>12</day>
                    <month>7</month>
                    <year>2019</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We are grateful to the reviewers for their spot-on comments, which we have addressed.</p>
                <p>The manuscript introduces a new R package, MXM, that offers a variety of feature selection algorithms in regression models. The new package presents relevant contribution to the toolbox of feature selection algorithms by covering more types of target variables than most existing packages, accommodating data with small or big sample sizes, and providing additional functionalities and utility features. The manuscript also demonstrates the usage of several functions in the package by analyzing real data. 
                    <list list-type="bullet">
                        <list-item>
                            <p>Comment: Regarding the presentation of the manuscript, I share reviewer Thodoris's concern that the acronyms in the paper are poorly explained. Although several of the acronyms are explained in Table 4 and Table 6, most appearances of these acronyms have no reference to these tables or explanations. This makes the paper unnecessarily hard to read.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have added the meaning of each acronym where it first appears, at various places within the text.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: Secondly, I think the demonstration of the usage of the package could be more informative with more interpretation. For example, how can we interpret the p-values, adjusted R-squared values, and deviance in the results of the model fittings? High-level descriptions of the algorithms behind the demonstrated functions could also help the reader better understand the results.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We thank the reviewer for this comment. We have added a short description of what each algorithm does in Section &#x201c;The MXM&#x2019;s FS algorithms and comparison with other FS algorithms&#x201d;. This was also highlighted by Prof Kypraios. Also, in Section &#x201c;Advantages and disadvantages of MXM&#x2019;s FS algorithms&#x201d; we have added a short paragraph regarding the p-values produced by the algorithms.</p>
                        </list-item>
                    </list> Below are comments to specific places in the manuscript: 
                    <list list-type="bullet">
                        <list-item>
                            <p>Comment: The 3rd paragraph in Introduction mentions that statistical tests (likelihood ratio, Wald, permutation based) can be plugged into the feature selection algorithms that work with small sample sized data. The aforementioned tests traditionally rely on large sample theory. I wonder what adjustments are needed to accommodate small sample sizes?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have added a footnote on page 5 regarding this. The reason we added this information at this point is that this is where we discuss the suitability of the algorithms based on the sample size. For example, we say that MMPC is more suitable for small sample sized data and explain why. FBED, on the other hand, is devised for large sample sizes.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: Tables 1 and 2 and Figure 1 show statistics of target variable types supported by a number of R packages. What algorithm was used to identify the types of target variables accepted by an R package?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We did this manually. We searched on CRAN and went through all packages one by one. We did this process twice to make sure we did not omit any package. Bear in mind that this search was made a few months ago.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: What does &#x201c;multiple datasets&#x201d; mean as in &#x201c;Only 2 (1.08%) R packages teat the case of FS with multiple datasets ...&#x201d; (last sentence of Paragraph 1 on Page 5)?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We added a footnote at this point explaining what we mean by multiple datasets and how we perform feature selection in this case. The 1.08% means that only 2 R packages (available on CRAN or Bioconductor) perform feature selection with multiple datasets.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: I find the last paragraph on Page 5 particularly hard to read due to the acronyms used without explanation. Some of these are later explained in Table 6 while a couple are not (e.g., MMHC). I think it would be helpful to make a comprehensive table explaining the algorithm acronyms used in this paper, and refer to the table here.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We thank the reviewer for this comment. We have explained all acronyms in various places, wherever the relevant algorithm is first mentioned.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: In the last paragraph on Page 5, there are multiple places that compare algorithms in terms of &#x201c;(predictive) performance&#x201d;. I think it would be more informative to clarify what performance metrics we are looking at.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have added a couple of sentences mentioning some predictive performance metrics for some target variables.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: In Paragraph 5 on Page 6, the manuscript states that &#x201c;SES ... returns multiple (statistically equivalent) sets of predictor variables ...&#x201d;. What does statistically equivalent mean?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: This comment was raised by the first reviewer as well, and we have answered it previously. In the Abstract we have added a sentence in parentheses briefly explaining this. Also, in Section &#x201c;The MXM&#x2019;s FS algorithms and comparison with other FS algorithms&#x201d; we have added one sentence where we mention the SES algorithm (this is in magenta colour).</p>
                        </list-item>
                        <list-item>
                            <p>Comment: What does the package name, MXM, stand for?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We thank the reviewer for this comment. We added the meaning of this acronym in the fourth footnote on page 2. It stands for Mens ex Machina.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: The second-to-last paragraph on Page 6 is a bit confusing to me. The paragraph starts by pointing out computational efficiency as a disadvantage of MXM. However, this is followed by an explanation of why gOMP is efficient, and by the observation that SES and MMPC scale better and run faster than the LASSO package. These seem like advantages in computational efficiency.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We thank the reviewer for this clarification. Indeed the message is diluted. We have reworded this point, towards the end of page 5.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: In Paragraph 2 of &#x201c;FS-related functions&#x201d; section: for MMPC and SES, storing the results from one run and passing it to subsequent runs can lead to significant computational savings. Are there savings because the trained models serve as good starting points in the subsequent runs?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have added a few sentences clarifying this point. All p-values are stored so that the tests need not be performed again in subsequent runs.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: On Page 9 in the output of MXM::fbed.reg, there are p-values associated with each selected variable. How can we interpret these p-values? It seems that these p-values are not adjusted for the extra degrees of freedom from feature selection.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We do recognize this problem and are aware that the distribution of the test statistics is not the assumed one [1] and that their associated p-values can be small [2]. We have added a short discussion of the p-values, from the bottom of page 4 to the top of page 5. The p-values are not FDR corrected, but we refer to the papers that mention this issue as well. With MMPC, for example, there is no need to apply an FDR correction because the algorithm controls the FDR [3]. Reference [4] discusses methods that address this issue and concludes that the FBED algorithm is orthogonal to those methods and could be used in conjunction with them. To close this issue, we note that it is not easy to control the FDR in the context of feature selection; it is an open problem. It is commonly accepted, in the statistical literature and beyond, that this leads to overestimated regression coefficients. A solution would be to fit a LASSO regularised model, for example, and tune the penalty parameter via cross-validation. However, the problem of FDR is then not fully addressed, but rather bypassed.</p>
                        </list-item>
                    </list> Below are some minor suggestions/typo fixes: 
                    <list list-type="bullet">
                        <list-item>
                            <p>Comment: In Abstract &#x2013; &#x201c;The R package MXM is such an example, which offers ...&#x201d;: It&#x2019;s unclear to me what &#x201c;such an example&#x201d; refers to. It&#x2019;s clearer to state &#x201c;The R package MXM offers ...&#x201d; directly.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We thank the reviewer for this suggestion. We have reworded this sentence.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: Second paragraph in Introduction: &#x201c;For example, packages that accept few or specific types of target variables.&#x201d; -&gt; &#x201c;For example, some packages accept few or specific types of target variables.&#x201d; (make it a complete sentence.)</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We thank the reviewer for pointing out this syntactical mistake, which we have now corrected.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: In &#x201c;MXM versus other R packages&#x201d; section: &#x201c;... can treat at least one type of target variable, ...&#x201d; -&gt; &#x201c;... can treat at least k types of target variables, for k = 1, 2, ..., 8, ...&#x201d;</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have changed this sentence as requested.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: In the last paragraph on Page 5, does &#x201c;BN&#x201d; refer to Bayesian network (which also appears in the same paragraph)?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We replaced BN with Bayesian network.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: In Paragraph 5 on Page 6 &#x2013; &#x201c;... making it one one of the few FS algorithms ...&#x201d;: remove the extra &#x201c;one&#x201d;.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have removed the extra &#x201c;one&#x201d;.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: In Paragraph 6 on Page 6 &#x2013; &#x201c;MXM is using an efficient memory handling R package.&#x201d;: please cite the package.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have mentioned and cited the relevant package.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: Last Paragraph on Page 7: &#x201c;generalised&#x201d; -&gt; &#x201c;generalized&#x201d; to be consistent with other places in the paper.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We changed everything to &#x201c;generalised&#x201d;.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: First paragraph in Section &#x201c;Use cases&#x201d;: the argument &#x201c;test&#x201d; refers to the type of regression model to be used. Why not name the argument something like &#x201c;model type&#x201d;?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have reworded this part of the manuscript and added some more information about its target.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: Some of the citations are not easily distinguishable from footnotes. e.g., citation 21 for Rfast on Page 8 and footnote 7 are both superscripts.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We did not spot this issue in the paper.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: Paragraph 2 on Page 11: &#x201c;gOMP on the other has ...&#x201d; -&gt; &#x201c;gOMP on the other hand ...&#x201d;.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We changed this, as also requested by the first reviewer.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: First paragraph on Page 12 &#x2013; &#x201c;YouTube video streaming applications applications&#x201d;: duplicate &#x201c;applications&#x201d;.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We removed the duplicate.</p>
                        </list-item>
                    </list> [1] Trevor Hastie, Robert Tibshirani and Jerome H. Friedman. The elements of statistical learning: data mining, inference, and prediction. New York: Springer, 2009.</p>
                <p>[2] Frank E. Harrell Jr. Regression modeling strategies. Springer, 2017.</p>
                <p>[3] Ioannis Tsamardinos and Laura E. Brown. Bounding the False Discovery Rate in Local Bayesian Network Learning. In AAAI, pages 1100&#x2013;1105, 2008.</p>
                <p>[4] Giorgos Borboudakis and Ioannis Tsamardinos. Forward-backward selection with early dropping. Journal of Machine Learning Research, 20(8):1&#x2013;39, 2019.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report38571">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.17707.r38571</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Kypraios</surname>
                        <given-names>Thodoris</given-names>
                    </name>
                    <xref ref-type="aff" rid="r38571a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r38571a1">
                    <label>1</label>School of Mathematical Sciences, University of Nottingham, Nottingham, UK</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>8</day>
                <month>10</month>
                <year>2018</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2018 Kypraios T</copyright-statement>
                <copyright-year>2018</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport38571" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.16216.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>
                <bold>Summary</bold>
            </p>
            <p>The paper is concerned with the method of feature selection using the R package MXM. The package appears to be fairly versatile in the sense that it can handle a huge variety of types of data. It can be very useful for applied researchers and, at the very least, it is another tool in the toolbox for the applied statistician.</p>
            <p>I believe that the paper will be a useful addition to the literature. However, I found it difficult to read and understand in places. For example, is MXM a package that only includes methodology that has been developed by the authors, or does it include methods that have been developed by other researchers too? Either way is fine, but it would be helpful to clarify.</p>
            <p>Another major issue for me is that the paper is flooded with acronyms that I believe most users will be unfamiliar with. It would improve the paper's presentation significantly if there were some explanation (a couple of sentences per method would suffice) of each of them.</p>
            <p>Below are some more specific points to consider:</p>
            <p> 
                <bold>Abstract</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>It is not clear on first reading what "b) it contains a variety of regression models to plug into the feature selection algorithms;" means.</p>
                    </list-item>
                    <list-item>
                        <p>c), "equivalent" in what sense?</p>
                    </list-item>
                    <list-item>
                        <p>&#x00a0;"it includes memory efficient algorithms for high volume data, data that cannot be loaded into R" -&gt; "it includes memory efficient algorithms for high volume data that often cannot be loaded easily into R"?&#x00a0;</p>
                    </list-item>
                </list> </p>
            <p> 
                <bold>Introduction</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>"and easier to understand and interpret;" -&gt; "and often easier to understand and interpret;"</p>
                    </list-item>
                    <list-item>
                        <p>&#x00a0;"small portion" -&gt; "small proportion"?&#x00a0;</p>
                    </list-item>
                    <list-item>
                        <p>It would be good for Table 1 to include the totals.</p>
                    </list-item>
                    <list-item>
                        <p>"These algorithms have been tested and compared with other state-of-the-art algorithms under different scenarios and types of data." Any references?</p>
                    </list-item>
                    <list-item>
                        <p>I find the section "Comparisons of MXM&#x2019; FS algorithms with other FS algorithms" hard to read because it uses many acronyms whose meaning I (and presumably other readers) do not know; it would be good to say a few words about what each algorithm is doing.</p>
                    </list-item>
                    <list-item>
                        <p>&#x00a0;"anecdotal" -&gt; preliminary?</p>
                    </list-item>
                    <list-item>
                        <p>It would be good to say what g-OMP (as well as the other methods) is doing.</p>
                    </list-item>
                </list> &#x00a0;</p>
            <p> 
                <bold>Methods</bold> 
                <list list-type="bullet">
                    <list-item>
                        <p>In the example "Survival (or time-to-event) target variable", a survival model is fitted, the MXM package is used, and its output is presented, but it is unclear to me what the objective is (in terms of the data analysis) and what the output means.</p>
                    </list-item>
                    <list-item>
                        <p>The above comment also applies to the other datasets; it is crucial that the reader first knows what the statistical objective is, and that the paper then explains what the output of the package means.</p>
                    </list-item>
                    <list-item>
                        <p>page 11: "gOMP on the other" -&gt; "gOMP on the other _hand_" ?</p>
                    </list-item>
                </list>
            </p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Yes</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Partly</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment4747-38571">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Tsagris</surname>
                            <given-names>Michail</given-names>
                        </name>
                        <aff>University of Crete, Greece</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>12</day>
                    <month>7</month>
                    <year>2019</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We are grateful to the reviewers for their spot-on comments.</p>
                <p>The paper is concerned with the method of feature selection using the R package MXM. The package appears to be fairly versatile in the sense that it can handle a huge variety of types of data. It can be very useful for applied researchers and, at the very least, it is another tool in the toolbox for the applied statistician. 
                    <list list-type="bullet">
                        <list-item>
                            <p>Comment: I believe that the paper will be a useful addition to the literature. However, I found it difficult to read/understand in places. For example, is MXM a package that only includes methodology that has been developed by the authors, or does it include methods that have been developed by other researchers too? Either way is fine, but it would be helpful to clarify.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have added an extra column in Table 5 giving this information. Also, in Section &#x201c;The MXM&#x2019;s FS algorithms and comparison with other FS algorithms&#x201d; where we describe the algorithms we have added the relevant references.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: Another major issue for me is that the paper is flooded with acronyms that I believe most users will be unfamiliar with. It would improve the paper&#x2019;s presentation significantly if there were some explanation (a couple of sentences per method would suffice) of each of them.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have added an explanation of each acronym where the algorithm is first mentioned. Also, in Section &#x201c;The MXM&#x2019;s FS algorithms and comparison with other FS algorithms&#x201d; we briefly mention how they work.</p>
                        </list-item>
                    </list> Below are some more specific points to consider:</p>
                <p>Abstract 
                    <list list-type="bullet">
                        <list-item>
                            <p>Comment: It is not clear on first reading what "b)it contains a variety of regression models to plug into the feature selection algorithms;" means.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have added a sentence in parentheses that explains this statement by giving examples of target variables and the appropriate regression models.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: c), "equivalent" in what sense?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We added a sentence in parentheses explaining this statement. &#x201c;Equivalent&#x201d; refers to the &#x201c;information&#x201d; a feature contains. If, for example, the information gain of a variable is not significant, this could be attributed to another feature that contains the same information. Consider, for example, collinear variables, such as the length of the left hand and the length of the right hand: either hand can be used in a regression model. This phenomenon is prevalent, but little examined, in bioinformatics, where the selected genes may not be the ones expected by the biologists simply because some equivalent genes were selected instead. Only a few feature selection algorithms return all equivalent features, and SES is one of them.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: "it includes memory efficient algorithms for high volume data, data that cannot be loaded into R" -&gt; "it includes memory efficient algorithms for high volume data that often cannot be loaded easily into R"?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have added a sentence in parentheses explaining this statement. The term &#x201c;big data&#x201d; has in many instances been wrongly used to denote large-scale data. By high volume data we mean data whose size in GB equals or exceeds the available RAM of one&#x2019;s computer and which hence cannot be loaded into R.</p>
                        </list-item>
                    </list> Introduction 
                    <list list-type="bullet">
                        <list-item>
                            <p>Comment: "and easier to understand and interpret;" -&gt; "and often easier to understand and interpret;"</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We thank the reviewer for this suggestion. We changed it.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: "small portion" -&gt; "small proportion"?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We did not change this word. We use &#x201c;portion&#x201d; as a synonym for &#x201c;share&#x201d;; the word &#x201c;proportion&#x201d; is further from the meaning we intended.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: It would be good for Table 1 to have the totals.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We added the total (184 packages) in the caption of Table 1.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: "These algorithms have been tested and compared with other state-of-the-art algorithms under different scenarios and types of data." Any references?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have added the relevant reference for each algorithm.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: I find it hard to read the "Comparisons of MXM&#x2019;s FS algorithms with other FS algorithms" section because there are a bunch of acronyms used whose meaning I (and presumably other readers) don&#x2019;t know; it would be good to say a few words about what each algorithm is doing.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We thank the reviewer for this comment. We have expanded this section by adding a short description for each algorithm.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: "anecdotal" -&gt; preliminary?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We rewrote the whole sentence and now refer to the relevant paper, a draft available on bioRxiv.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: It would be good to say what g-OMP (as well as the other methods) is doing.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We have added a short description of what each feature selection algorithm does.</p>
                        </list-item>
                    </list> Methods 
                    <list list-type="bullet">
                        <list-item>
                            <p>Comment: In the example "Survival (or time-to-event) target variable", a survival model is fitted, the MXM package is used, and its output is presented, but I am unclear as to what the objective is (in terms of the data analysis) and what the output means.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We thank the reviewer for this important comment. We have modified this example and the others. In addition, we have added a paragraph at the beginning of this section explaining its goal.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: The above comment applies to the above datasets; it is crucial that the reader first knows what the statistical objective is, and that the outcome of the package is then explained.</p>
                        </list-item>
                        <list-item>
                            <p>Reply: Please see our reply to the previous comment.</p>
                        </list-item>
                        <list-item>
                            <p>Comment: page 11: "gOMP on the other" -&gt; "gOMP on the other hand"?</p>
                        </list-item>
                        <list-item>
                            <p>Reply: We added the word &#x201c;hand&#x201d;.</p>
                        </list-item>
                    </list>
                </p>
            </body>
        </sub-article>
    </sub-article>
</article>
