<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="other" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.11968.1</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Software Tool Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                    <subj-group>
                        <subject>Bioinformatics</subject>
                    </subj-group>
                    <subj-group>
                        <subject>Genomics</subject>
                    </subj-group>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>cophesim: A comprehensive phenotype simulator for testing novel association methods</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 1; peer review: 2 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Zhbannikov</surname>
                        <given-names>Ilya Y.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-6502-6514</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Arbeev</surname>
                        <given-names>Konstantin G.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="corresp" rid="c2">b</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Yashin</surname>
                        <given-names>Anatoliy I.</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Biodemography of Aging Research Unit (BARU) at Social Sciences Research Institute (SSRI), Duke University, Durham, NC, 27705, USA</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:ilya.zhbannikov@duke.edu">ilya.zhbannikov@duke.edu</email>
                </corresp>
                <corresp id="c2">
                    <label>b</label>
                    <email xlink:href="mailto:konstantin.arbeev@duke.edu">konstantin.arbeev@duke.edu</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>1</day>
                <month>8</month>
                <year>2017</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2017</year>
            </pub-date>
            <volume>6</volume>
            <elocation-id>1294</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>25</day>
                    <month>7</month>
                    <year>2017</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Zhbannikov IY et al.</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/6-1294/pdf"/>
            <abstract>
                <p>Simulation is important in evaluating novel methods when input data is not easily obtainable or specific assumptions are needed. We present 
                    <italic toggle="yes">cophesim</italic>, a software tool that adds simulated phenotypes to genotype data generated by a genetic simulator. The output of 
                    <italic toggle="yes">cophesim</italic> can be used directly as input for various genome-wide association study (GWAS) tools. 
                    <italic toggle="yes">cophesim</italic> is available from 
                    <ext-link ext-link-type="uri" xlink:href="https://bitbucket.org/izhbannikov/cophesim">https://bitbucket.org/izhbannikov/cophesim</ext-link>.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>Phenotype simulation</kwd>
                <kwd>GWAS</kwd>
            </kwd-group>
            <funding-group>
                <award-group id="fund-1" xlink:href="http://dx.doi.org/10.13039/100000049">
                    <funding-source>National Institute on Aging</funding-source>
                    <award-id>P01AG043352</award-id>
                    <award-id>R01AG046860</award-id>
                    <award-id>P30AG034424</award-id>
                </award-group>
                <funding-statement>This work was supported by the National Institute on Aging of the National Institutes of Health (NIA/NIH) under award numbers P01AG043352, R01AG046860, and P30AG034424</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>Genome-wide association studies (GWAS) are routine in population research. New methods are being developed to better assess complex associations between genotypes and phenotypes, uncover genotype structures or test evolutionary hypotheses. Testing novel methods requires experimental data, which may not be easily obtainable. One solution is to use artificial data simulated under specific assumptions.</p>
            <p>Existing phenotype simulators, such as 
                <italic toggle="yes">GENOME</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-1">1</xref>
                </sup>, 
                <italic toggle="yes">Plink</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-2">2</xref>
                </sup>, 
                <italic toggle="yes">phenosim</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-3">3</xref>
                </sup>, 
                <italic toggle="yes">CoaSim</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-4">4</xref>
                </sup>, 
                <italic toggle="yes">Fregene</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-5">5</xref>
                </sup>, 
                <italic toggle="yes">ForSim</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-6">6</xref>
                </sup>, 
                <italic toggle="yes">QuantiNemo</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-7">7</xref>
                </sup>, 
                <italic toggle="yes">GCTA</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-8">8</xref>
                </sup>, 
                <italic toggle="yes">HapGen</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-9">9</xref>
                </sup>, 
                <italic toggle="yes">SeqSimla</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-10">10</xref>
                </sup>, and 
                <italic toggle="yes">SimRare</italic>
                <sup>
                    <xref ref-type="bibr" rid="ref-11">11</xref>
                </sup>, offer quantitative and dichotomous simulated phenotypes. However, these tools have limitations that may deter users: (i) most, if not all, of them do not simulate survival traits (time-to-event outcomes), making it impossible to test the corresponding association hypotheses; (ii) some are difficult to use because of the wide range of parameters that the user has to provide and control (rather than having them calculated automatically), which discourages further use; (iii) phenotype simulation is often offered as an auxiliary part of a genetic simulation routine, so the user first has to run a time-consuming genetic simulation in order to obtain the phenotype; (iv) when the genetic data have already been simulated with other tools, only 
                <italic toggle="yes">phenosim</italic> and 
                <italic toggle="yes">GCTA</italic> can add a simulated phenotype to such data. A new, simple and flexible phenotype simulation tool with plain algorithmic assumptions is therefore needed.</p>
            <p>To address these limitations, we present 
                <italic toggle="yes">cophesim</italic>, a comprehensive phenotype simulation tool developed to add a phenotype to genotypes simulated by other simulation tools (
                <xref ref-type="other" rid="SF1">Table S1</xref>). 
                <italic toggle="yes">cophesim</italic> offers simulation of continuous, dichotomous and survival traits, with different (user-provided) effect sizes of causal variants and with the ability to simulate epistatic interactions. It can also simulate phenotypes under gene-environment interaction assumptions using up to 10 covariates.</p>
        </sec>
        <sec sec-type="methods">
            <title>Methods</title>
            <sec>
                <title>Implementation</title>
                <p>The workflow (see 
                    <xref ref-type="fig" rid="f1">Figure 1</xref>) includes the following stages: (i) input data pre-processing; (ii) phenotype simulation; (iii) generation of final output files.</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <p>The workflow of 
                            <italic toggle="yes">cophesim</italic> has three stages: (1) Input stage, where the input data (which can be provided in one of three formats: 
                            <italic toggle="yes">Plink</italic>, 
                            <italic toggle="yes">MS</italic> and 
                            <italic toggle="yes">GENOME</italic>; see the user manual, 
                            <xref ref-type="other" rid="SF2">Supplementary File 1</xref>) and the other input parameters (such as causal variants with effect sizes, output format, etc.) are prepared for phenotype simulation; (2) Phenotype simulation stage, where different types of phenotypic traits are simulated: dichotomous, continuous and time-to-event (&#x2018;survival&#x2019;); (3) Output stage &#x2013; the final stage, where the simulated phenotype data are written in various formats so that they can be used directly by six GWAS tools: 
                            <italic toggle="yes">EMMAX</italic>, 
                            <italic toggle="yes">BLOSSOC</italic>, 
                            <italic toggle="yes">Plink</italic>, 
                            <italic toggle="yes">QTDT</italic>, 
                            <italic toggle="yes">TASSEL</italic> and 
                            <italic toggle="yes">GenABEL</italic>. Summary statistics are generated at the output stage as well.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12938/139a6dfe-3ead-4a3c-b6cd-eae867165496_figure1.gif"/>
                </fig>
            </sec>
            <sec>
                <title>Input data</title>
                <p>Currently 
                    <italic toggle="yes">cophesim</italic> accepts the genotype output data from 
                    <italic toggle="yes">Plink</italic>, 
                    <italic toggle="yes">MS</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-12">12</xref>
                    </sup> and 
                    <italic toggle="yes">GENOME</italic> software applications. Phenotypes (dichotomous, continuous and survival) are then added according to the following simulation scenarios.</p>
            </sec>
            <sec>
                <title>Dichotomous phenotype</title>
                <p>The dichotomous phenotype for the 
                    <italic toggle="yes">i
                        <sup>th</sup>
                    </italic> individual (
                    <italic toggle="yes">i</italic> = 1...
                    <italic toggle="yes">N</italic>, where 
                    <italic toggle="yes">N</italic> is the total number of individuals in a dataset) is simulated according to the logistic model (if the user provided effect sizes for causal variants): 
                    <disp-formula id="e1">
                        <mml:math display="block" id="math1">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mi>p</mml:mi>
                                    <mml:mi>i</mml:mi>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mfrac>
                                    <mml:mn>1</mml:mn>
                                    <mml:mrow>
                                        <mml:mn>1</mml:mn>
                                        <mml:mo>+</mml:mo>
                                        <mml:msup>
                                            <mml:mi>e</mml:mi>
                                            <mml:mrow>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:msub>
                                                    <mml:mi>z</mml:mi>
                                                    <mml:mi>i</mml:mi>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:msup>
                                    </mml:mrow>
                                </mml:mfrac>
                                <mml:mspace width="3em"/>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mn>1</mml:mn>
                                <mml:mo stretchy="false">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula> where 
                    <italic toggle="yes">p
                        <sub>i</sub>
                    </italic> is the probability of a particular outcome. In 
                    <italic toggle="yes">cophesim</italic>, it is the probability of being a &#x201c;case&#x201d; (cases are coded &#x201c;1&#x201d; and controls &#x201c;0&#x201d; in the simulated dichotomous phenotype) for the 
                    <italic toggle="yes">i
                        <sup>th</sup>
                    </italic> individual. If 
                    <italic toggle="yes">p
                        <sub>i</sub>
                    </italic> is greater than a threshold 
                    <italic toggle="yes">p</italic>
                    <sub>0</sub> (we use 
                    <italic toggle="yes">p</italic>
                    <sub>0</sub> ~ 
                    <italic toggle="yes">U</italic>(0, 1)), then the phenotype for the 
                    <italic toggle="yes">i
                        <sup>th</sup>
                    </italic> individual is set to &#x201c;1&#x201d;, and to &#x201c;0&#x201d; otherwise. The variable 
                    <italic toggle="yes">z
                        <sub>i</sub>
                    </italic> is given by the following equation: 
                    <disp-formula id="e2">
                        <mml:math display="block" id="math2">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mi>z</mml:mi>
                                    <mml:mi>i</mml:mi>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mstyle displaystyle="true">
                                    <mml:munderover>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>j</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>M</mml:mi>
                                    </mml:munderover>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mi>E</mml:mi>
                                            <mml:mi>j</mml:mi>
                                        </mml:msub>
                                        <mml:msub>
                                            <mml:mi>g</mml:mi>
                                            <mml:mrow>
                                                <mml:mi>i</mml:mi>
                                                <mml:mi>j</mml:mi>
                                            </mml:mrow>
                                        </mml:msub>
                                        <mml:mo>+</mml:mo>
                                        <mml:mstyle displaystyle="true">
                                            <mml:munderover>
                                                <mml:mo>&#x2211;</mml:mo>
                                                <mml:mrow>
                                                    <mml:mi>j</mml:mi>
                                                    <mml:mo>=</mml:mo>
                                                    <mml:mn>1</mml:mn>
                                                </mml:mrow>
                                                <mml:mi>K</mml:mi>
                                            </mml:munderover>
                                            <mml:mrow>
                                                <mml:msub>
                                                    <mml:mi>&#x03b1;</mml:mi>
                                                    <mml:mi>j</mml:mi>
                                                </mml:msub>
                                                <mml:msub>
                                                    <mml:mi>X</mml:mi>
                                                    <mml:mrow>
                                                        <mml:mi>i</mml:mi>
                                                        <mml:mi>j</mml:mi>
                                                    </mml:mrow>
                                                </mml:msub>
                                                <mml:mo>+</mml:mo>
                                                <mml:msub>
                                                    <mml:mi>&#x03f5;</mml:mi>
                                                    <mml:mi>i</mml:mi>
                                                </mml:msub>
                                            </mml:mrow>
                                        </mml:mstyle>
                                    </mml:mrow>
                                </mml:mstyle>
                                <mml:mspace width="3em"/>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mn>2</mml:mn>
                                <mml:mo stretchy="false">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula> where 
                    <italic toggle="yes">E
                        <sub>j</sub>
                    </italic> is the user-defined effect size for the 
                    <italic toggle="yes">j
                        <sup>th</sup>
                    </italic> variant; 
                    <italic toggle="yes">g
                        <sub>ij</sub>
                    </italic> is the value of the 
                    <italic toggle="yes">j
                        <sup>th</sup>
                    </italic> genetic marker for the 
                    <italic toggle="yes">i
                        <sup>th</sup>
                    </italic> individual; 
                    <italic toggle="yes">&#x03b1;
                        <sub>j</sub>
                    </italic> is the effect size for the 
                    <italic toggle="yes">j
                        <sup>th</sup>
                    </italic> covariate; 
                    <italic toggle="yes">X
                        <sub>ij</sub>
                    </italic> is the value of the 
                    <italic toggle="yes">j
                        <sup>th</sup>
                    </italic> covariate for the 
                    <italic toggle="yes">i
                        <sup>th</sup>
                    </italic> individual (the term 
                    <mml:math id="math3">
                        <mml:mrow>
                            <mml:mstyle displaystyle="true">
                                <mml:msubsup>
                                    <mml:mo>&#x2211;</mml:mo>
                                    <mml:mrow>
                                        <mml:mi>j</mml:mi>
                                        <mml:mo>=</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                    <mml:mi>K</mml:mi>
                                </mml:msubsup>
                                <mml:mrow>
                                    <mml:msub>
                                        <mml:mi>&#x03b1;</mml:mi>
                                        <mml:mi>j</mml:mi>
                                    </mml:msub>
                                    <mml:msub>
                                        <mml:mi>X</mml:mi>
                                        <mml:mrow>
                                            <mml:mi>i</mml:mi>
                                            <mml:mi>j</mml:mi>
                                        </mml:mrow>
                                    </mml:msub>
                                </mml:mrow>
                            </mml:mstyle>
                        </mml:mrow>
                    </mml:math> is added to represent gene-environment interactions); 
                    <italic toggle="yes">&#x03f5;
                        <sub>i</sub>
                    </italic> is a standard normal residual, 
                    <italic toggle="yes">&#x03f5;
                        <sub>i</sub>
                    </italic> ~ 
                    <italic toggle="yes">N</italic>(0, 1), computed for the 
                    <italic toggle="yes">i
                        <sup>th</sup>
                    </italic> individual; 
                    <italic toggle="yes">M</italic> is the total number of genetic variants; and 
                    <italic toggle="yes">K</italic> is the total number of covariates used.</p>
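                <p>As an illustration only (this is not the 
                    <italic toggle="yes">cophesim</italic> source code), equations (1) and (2) can be sketched in Python with NumPy as follows. The function name 
                    <monospace>simulate_dichotomous</monospace>, its arguments, the toy genotype matrix and the choice to draw one threshold 
                    <italic toggle="yes">p</italic>
                    <sub>0</sub> per individual are assumptions made for this example.</p>
                <code language="python">
```python
import numpy as np


def simulate_dichotomous(G, E, X=None, alpha=None, rng=None):
    """Draw case/control labels following equations (1) and (2).

    G     : (N, M) genotype matrix, g_ij
    E     : (M,) effect sizes of the variants, E_j
    X     : optional (N, K) covariate matrix, X_ij
    alpha : optional (K,) covariate effect sizes, alpha_j
    """
    rng = np.random.default_rng() if rng is None else rng
    G = np.asarray(G, dtype=float)
    n = G.shape[0]

    # Equation (2): genetic term plus a standard normal residual.
    z = G @ np.asarray(E, dtype=float) + rng.standard_normal(n)
    if X is not None:
        # Covariate term of equation (2).
        z = z + np.asarray(X, dtype=float) @ np.asarray(alpha, dtype=float)

    # Equation (1): logistic transform of z_i.
    p = 1.0 / (1.0 + np.exp(-z))

    # Threshold p0 ~ U(0, 1); one draw per individual (an assumption here).
    p0 = rng.uniform(0.0, 1.0, size=n)
    return (p > p0).astype(int), p


# Toy example: 1000 individuals, 5 variants coded 0, 1, 2 (hypothetical data).
rng = np.random.default_rng(1)
G = rng.integers(0, 3, size=(1000, 5))
E = np.array([0.8, 0.0, 0.0, -0.5, 0.0])  # two causal variants
phenotype, p = simulate_dichotomous(G, E, rng=rng)
```
                </code>
                <p>Under this assumption, comparing 
                    <italic toggle="yes">p
                        <sub>i</sub>
                    </italic> with a uniform threshold amounts to a Bernoulli draw with success probability 
                    <italic toggle="yes">p
                        <sub>i</sub>
                    </italic>, so individuals with larger 
                    <italic toggle="yes">z
                        <sub>i</sub>
                    </italic> are more likely to be labelled as cases.</p>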
                <p>If the user does not provide effect sizes for the causal variants, the following model is used instead: 
                    <disp-formula id="e3">
                        <mml:math display="block" id="math4">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mi>z</mml:mi>
                                    <mml:mi>i</mml:mi>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:mstyle displaystyle="true">
                                    <mml:munderover>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>j</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>M</mml:mi>
                                    </mml:munderover>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mi>w</mml:mi>
                                            <mml:mrow>
                                                <mml:mi>i</mml:mi>
                                                <mml:mi>j</mml:mi>
                                            </mml:mrow>
                                        </mml:msub>
                                    </mml:mrow>
                                </mml:mstyle>
                                <mml:mo>+</mml:mo>
                                <mml:mstyle displaystyle="true">
                                    <mml:munderover>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>j</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>K</mml:mi>
                                    </mml:munderover>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mi>&#x03b1;</mml:mi>
                                            <mml:mi>j</mml:mi>
                                        </mml:msub>
                                        <mml:msub>
                                            <mml:mi>X</mml:mi>
                                            <mml:mrow>
                                                <mml:mi>i</mml:mi>
                                                <mml:mi>j</mml:mi>
                                            </mml:mrow>
                                        </mml:msub>
                                        <mml:mo>+</mml:mo>
                                    </mml:mrow>
                                </mml:mstyle>
                                <mml:msub>
                                    <mml:mi>&#x03f5;</mml:mi>
                                    <mml:mi>i</mml:mi>
                                </mml:msub>
                                <mml:mspace width="3em"/>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mn>3</mml:mn>
                                <mml:mo stretchy="false">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula>
                </p>
                <p>Here 
                    <italic toggle="yes">w
                        <sub>ij</sub>
                    </italic> is a weight computed as follows: 
                    <mml:math id="math5">
                        <mml:mrow>
                            <mml:msub>
                                <mml:mi>w</mml:mi>
                                <mml:mrow>
                                    <mml:mi>i</mml:mi>
                                    <mml:mi>j</mml:mi>
                                </mml:mrow>
                            </mml:msub>
                            <mml:mo>=</mml:mo>
                            <mml:mfrac>
                                <mml:mrow>
                                    <mml:msub>
                                        <mml:mi>g</mml:mi>
                                        <mml:mrow>
                                            <mml:mi>i</mml:mi>
                                            <mml:mi>j</mml:mi>
                                        </mml:mrow>
                                    </mml:msub>
                                    <mml:mo>&#x2212;</mml:mo>
                                    <mml:mn>2</mml:mn>
                                    <mml:mi>M</mml:mi>
                                    <mml:mi>A</mml:mi>
                                    <mml:msub>
                                        <mml:mi>F</mml:mi>
                                        <mml:mi>j</mml:mi>
                                    </mml:msub>
                                </mml:mrow>
                                <mml:mrow>
                                    <mml:msup>
                                        <mml:mrow>
                                            <mml:mo stretchy="false">(</mml:mo>
                                            <mml:mn>2</mml:mn>
                                            <mml:mi>M</mml:mi>
                                            <mml:mi>A</mml:mi>
                                            <mml:msub>
                                                <mml:mi>F</mml:mi>
                                                <mml:mi>j</mml:mi>
                                            </mml:msub>
                                            <mml:mo stretchy="false">(</mml:mo>
                                            <mml:mn>1</mml:mn>
                                            <mml:mo>&#x2212;</mml:mo>
                                            <mml:mi>M</mml:mi>
                                            <mml:mi>A</mml:mi>
                                            <mml:msub>
                                                <mml:mi>F</mml:mi>
                                                <mml:mi>j</mml:mi>
                                            </mml:msub>
                                            <mml:mo stretchy="false">)</mml:mo>
                                            <mml:mo stretchy="false">)</mml:mo>
                                        </mml:mrow>
                                        <mml:mrow>
                                            <mml:mn>1</mml:mn>
                                            <mml:mo>/</mml:mo>
                                            <mml:mn>2</mml:mn>
                                        </mml:mrow>
                                    </mml:msup>
                                </mml:mrow>
                            </mml:mfrac>
                        </mml:mrow>
                    </mml:math> (a standardization procedure; the matrix 
                    <italic toggle="yes">W</italic> containing the elements 
                    <italic toggle="yes">w
                        <sub>ij</sub>
                    </italic> is called a standardized genotype matrix
                    <sup>
                        <xref ref-type="bibr" rid="ref-8">8</xref>
                    </sup>). 
                    <italic toggle="yes">MAF
                        <sub>j</sub>
                    </italic> is the minor allele frequency of the 
                    <italic toggle="yes">j
                        <sup>th</sup>
                    </italic> genetic variant, and the other quantities are as described above. This strategy makes it possible to impose a defined genetic architecture on a simulated population.</p>
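                <p>As an illustration, the standardization above can be sketched in Python as follows. This is a minimal sketch and not the <italic toggle="yes">cophesim</italic> implementation: the function name <monospace>standardize_genotypes</monospace> is hypothetical, and we assume that the minor allele frequency of each variant is estimated from the sample itself and that monomorphic variants receive a weight of zero.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
import math


def standardize_genotypes(genotypes):
    """Return (W, maf) with w_ij = (g_ij - 2*MAF_j) / sqrt(2*MAF_j*(1 - MAF_j)).

    genotypes: list of rows (one per individual); each row holds the
    minor-allele count (0, 1 or 2) for every variant.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Assumption: MAF_j is estimated from the sample (mean allele count / 2).
    maf = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    weights = []
    for row in genotypes:
        w_row = []
        for j in range(m):
            sd = math.sqrt(2.0 * maf[j] * (1.0 - maf[j]))
            # Assumption: a monomorphic variant (sd == 0) gets weight 0.
            w_row.append((row[j] - 2.0 * maf[j]) / sd if sd > 0 else 0.0)
        weights.append(w_row)
    return weights, maf


# Toy genotype matrix: 4 individuals, 2 variants.
G = [[0, 1], [1, 1], [2, 0], [1, 2]]
W, maf = standardize_genotypes(G)
```
                    </preformat>
                </p>
                <p>With sample-based allele frequencies, each column of the resulting matrix has mean zero, which is a convenient sanity check on the output.</p>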
            </sec>
            <sec>
                <title>Continuous phenotype</title>
                <p>A quantitative (continuous) phenotype for the 
                    <italic toggle="yes">i
                        <sup>th</sup>
                    </italic> individual is simulated under the linear regression scenario using equation (2) or, if effect sizes were not supplied, equation (3).</p>
            </sec>
            <sec>
                <title>Inverse Probability method</title>
                <p>We model a survival phenotype from the proportional hazards model using the inverse probability method
                    <sup>
                        <xref ref-type="bibr" rid="ref-13">13</xref>
                    </sup>: if 
                    <italic toggle="yes">U</italic> is uniform in (0, 1) and if 
                    <italic toggle="yes">S</italic>(&#x00b7;|
                    <bold>z</bold>) is the conditional survival function derived from the proportional hazards model: 
                    <italic toggle="yes">S</italic>(
                    <italic toggle="yes">t</italic>|
                    <bold>z</bold>) = 
                    <italic toggle="yes">e</italic>
                    <sup>&#x2013;
                        <italic toggle="yes">H</italic>
                        <sub>0</sub>(
                        <italic toggle="yes">t</italic>)
                        <italic toggle="yes">e</italic>
                        <sup>
                            <bold>z</bold>
                        </sup>
                    </sup>, then the random variable 
                    <disp-formula id="e4">
                        <mml:math display="block" id="math6">
                            <mml:mrow>
                                <mml:mi>T</mml:mi>
                                <mml:mo>=</mml:mo>
                                <mml:msup>
                                    <mml:mi>S</mml:mi>
                                    <mml:mrow>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                </mml:msup>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mo>&#x22c5;</mml:mo>
                                <mml:mo stretchy="false">|</mml:mo>
                                <mml:mtext mathvariant="bold">z</mml:mtext>
                                <mml:mo stretchy="false">)</mml:mo>
                                <mml:mo>=</mml:mo>
                                <mml:msub>
                                    <mml:mi>H</mml:mi>
                                    <mml:mn>0</mml:mn>
                                </mml:msub>
                                <mml:msup>
                                    <mml:mrow>
                                        <mml:mo stretchy="false">(</mml:mo>
                                        <mml:mi>t</mml:mi>
                                        <mml:mo stretchy="false">)</mml:mo>
                                    </mml:mrow>
                                    <mml:mrow>
                                        <mml:mo>&#x2212;</mml:mo>
                                        <mml:mn>1</mml:mn>
                                    </mml:mrow>
                                </mml:msup>
                                <mml:mrow>
                                    <mml:mo>(</mml:mo>
                                    <mml:mrow>
                                        <mml:mfrac>
                                            <mml:mrow>
                                                <mml:mo>&#x2212;</mml:mo>
                                                <mml:mtext mathvariant="italic">log</mml:mtext>
                                                <mml:mo stretchy="false">(</mml:mo>
                                                <mml:mi>U</mml:mi>
                                                <mml:mo stretchy="false">)</mml:mo>
                                            </mml:mrow>
                                            <mml:mrow>
                                                <mml:msup>
                                                    <mml:mi>e</mml:mi>
                                                    <mml:mi>z</mml:mi>
                                                </mml:msup>
                                            </mml:mrow>
                                        </mml:mfrac>
                                    </mml:mrow>
                                    <mml:mo>)</mml:mo>
                                </mml:mrow>
                                <mml:mspace width="3em"/>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mn>4</mml:mn>
                                <mml:mo stretchy="false">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula> has survival function 
                    <italic toggle="yes">S</italic>(&#x00b7;|
                    <bold>z</bold>). In this equation, 
                    <italic toggle="yes">H</italic>
                    <sub>0</sub>(
                    <italic toggle="yes">t</italic>) is a cumulative baseline hazard. By default, we use the Weibull cumulative baseline hazard: 
                    <italic toggle="yes">H</italic>
                    <sub>0</sub>(
                    <italic toggle="yes">t</italic>) = 
                    <italic toggle="yes">&#x03bb;&#x03c1;t</italic>
                    <sup>
                        <italic toggle="yes">&#x03c1;</italic>&#x2013;1</sup>; 
                    <bold>z</bold> is the same quantity as defined above for each individual, and its form depends on whether the user provided effect sizes for the causal variants. Exponential and Gompertz hazards are also implemented.</p>
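                <p>The inverse probability method can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the <italic toggle="yes">cophesim</italic> code: we assume the standard Weibull parameterization in which the cumulative baseline hazard is <italic toggle="yes">&#x03bb;t</italic><sup><italic toggle="yes">&#x03c1;</italic></sup> (its derivative is the hazard <italic toggle="yes">&#x03bb;&#x03c1;t</italic><sup><italic toggle="yes">&#x03c1;</italic>&#x2013;1</sup>), and the function names, the parameter names <monospace>lam</monospace> and <monospace>rho</monospace>, and their default values are hypothetical.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
import math
import random


def survival_function(t, z, lam=1.0, rho=1.5):
    """S(t|z) = exp(-H0(t) * exp(z)), assuming Weibull H0(t) = lam * t**rho."""
    return math.exp(-lam * t ** rho * math.exp(z))


def inverse_survival(u, z, lam=1.0, rho=1.5):
    """Solve S(T|z) = u for T, i.e. T = H0^{-1}(-log(u) / exp(z))."""
    return (-math.log(u) / (lam * math.exp(z))) ** (1.0 / rho)


def simulate_survival_times(z_values, lam=1.0, rho=1.5, seed=None):
    """Draw one survival time per individual from its linear predictor z_i."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], so log() never receives zero.
    return [inverse_survival(1.0 - rng.random(), z, lam, rho) for z in z_values]


# Three individuals with increasing genetic risk (larger z, higher hazard).
times = simulate_survival_times([0.0, 0.5, 1.0], seed=1)
```
                    </preformat>
                </p>
                <p>Setting <monospace>rho</monospace> to 1 in this sketch gives the exponential baseline hazard as a special case; a Gompertz baseline would require a different inverse of the cumulative hazard.</p>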
            </sec>
            <sec>
                <title>Linkage Disequilibrium</title>
                <p>The simplest way to simulate collinearity between two SNPs, 
                    <italic toggle="yes">g</italic>
                    <sub>1</sub> and 
                    <italic toggle="yes">g</italic>
                    <sub>2</sub>, with effect sizes 
                    <italic toggle="yes">E</italic>
                    <sub>1</sub> and 
                    <italic toggle="yes">E</italic>
                    <sub>2</sub> is to replace some portion of 
                    <italic toggle="yes">g</italic>
                    <sub>2</sub> with 
                    <italic toggle="yes">g</italic>
                    <sub>1</sub> values according to the provided 
                    <mml:math id="math7">
                        <mml:mrow>
                            <mml:msubsup>
                                <mml:mi>r</mml:mi>
                                <mml:mrow>
                                    <mml:mn>12</mml:mn>
                                </mml:mrow>
                                <mml:mn>2</mml:mn>
                            </mml:msubsup>
                        </mml:mrow>
                    </mml:math> coefficient, which reflects the correlation between the two SNPs. We are also considering other techniques, such as copulas, for simulating LD.</p>
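                <p>One possible reading of this replacement strategy is sketched below in Python. The mapping from the requested squared correlation to the replaced portion is our assumption and is not taken from <italic toggle="yes">cophesim</italic>: each genotype of the second SNP is replaced independently with a probability equal to the square root of the requested value, which yields that squared correlation in expectation when the two SNPs start out independent and have the same allele frequency. The names <monospace>induce_ld</monospace>, <monospace>pearson_r</monospace> and <monospace>draw_genotypes</monospace> are hypothetical.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
import math
import random


def induce_ld(g1, g2, r2, rng):
    """Copy g1 into g2 for a random subset of individuals.

    Assumption: each entry is replaced with probability sqrt(r2).
    """
    p = math.sqrt(r2)
    return [a if p > rng.random() else b for a, b in zip(g1, g2)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def draw_genotypes(n, maf, rng):
    """Independent genotypes (0, 1 or 2 minor alleles) for n individuals."""
    return [sum(1 for _ in range(2) if maf > rng.random()) for _ in range(n)]


rng = random.Random(7)
g1 = draw_genotypes(20000, 0.3, rng)
g2 = draw_genotypes(20000, 0.3, rng)
g2_ld = induce_ld(g1, g2, 0.64, rng)
observed_r2 = pearson_r(g1, g2_ld) ** 2  # expected to be close to 0.64
```
                    </preformat>
                </p>
                <p>If the two SNPs have different allele frequencies, the realized correlation will deviate from the requested value, so the result should be checked empirically as in the last line of the sketch.</p>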
            </sec>
            <sec>
                <title>Epistatic interactions</title>
                <p>Epistatic interactions are modeled with the following equation for the 
                    <italic toggle="yes">i
                        <sup>th</sup>
                    </italic> individual: 
                    <disp-formula id="e5">
                        <mml:math display="block" id="math8">
                            <mml:mrow>
                                <mml:msub>
                                    <mml:mi>z</mml:mi>
                                    <mml:mi>i</mml:mi>
                                </mml:msub>
                                <mml:mo>=</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>1</mml:mn>
                                </mml:msub>
                                <mml:msub>
                                    <mml:mi>g</mml:mi>
                                    <mml:mrow>
                                        <mml:mn>1</mml:mn>
                                        <mml:mi>i</mml:mi>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:mo>+</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mn>2</mml:mn>
                                </mml:msub>
                                <mml:msub>
                                    <mml:mi>g</mml:mi>
                                    <mml:mrow>
                                        <mml:mn>2</mml:mn>
                                        <mml:mi>i</mml:mi>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:mo>+</mml:mo>
                                <mml:msub>
                                    <mml:mi>E</mml:mi>
                                    <mml:mrow>
                                        <mml:mn>12</mml:mn>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:msub>
                                    <mml:mi>g</mml:mi>
                                    <mml:mrow>
                                        <mml:mn>1</mml:mn>
                                        <mml:mi>i</mml:mi>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:msub>
                                    <mml:mi>g</mml:mi>
                                    <mml:mrow>
                                        <mml:mn>2</mml:mn>
                                        <mml:mi>i</mml:mi>
                                    </mml:mrow>
                                </mml:msub>
                                <mml:mo>+</mml:mo>
                                <mml:mstyle displaystyle="true">
                                    <mml:munderover>
                                        <mml:mo>&#x2211;</mml:mo>
                                        <mml:mrow>
                                            <mml:mi>j</mml:mi>
                                            <mml:mo>=</mml:mo>
                                            <mml:mn>1</mml:mn>
                                        </mml:mrow>
                                        <mml:mi>k</mml:mi>
                                    </mml:munderover>
                                    <mml:mrow>
                                        <mml:msub>
                                            <mml:mi>&#x03b1;</mml:mi>
                                            <mml:mi>j</mml:mi>
                                        </mml:msub>
                                        <mml:msub>
                                            <mml:mi>X</mml:mi>
                                            <mml:mrow>
                                                <mml:mi>j</mml:mi>
                                                <mml:mi>i</mml:mi>
                                            </mml:mrow>
                                        </mml:msub>
                                        <mml:mo>+</mml:mo>
                                        <mml:msub>
                                            <mml:mi>&#x03f5;</mml:mi>
                                            <mml:mi>i</mml:mi>
                                        </mml:msub>
                                    </mml:mrow>
                                </mml:mstyle>
                                <mml:mspace width="3em"/>
                                <mml:mo stretchy="false">(</mml:mo>
                                <mml:mn>5</mml:mn>
                                <mml:mo stretchy="false">)</mml:mo>
                            </mml:mrow>
                        </mml:math>
                    </disp-formula> where the term 
                    <italic toggle="yes">E</italic>
                    <sub>12</sub> 
                    <italic toggle="yes">g</italic>
                    <sub>1
                        <italic toggle="yes">i</italic>
                    </sub> 
                    <italic toggle="yes">g</italic>
                    <sub>2
                        <italic toggle="yes">i</italic>
                    </sub> is the interaction term in which 
                    <italic toggle="yes">E</italic>
                    <sub>12</sub> is the epistatic effect size (user-defined, zero by default); &#x03b1;
                    <italic toggle="yes">
                        <sub>j</sub>
                    </italic> is the effect size for the 
                    <italic toggle="yes">j
                        <sup>th</sup>
                    </italic> covariate 
                    <italic toggle="yes">X</italic>.</p>
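                <p>For a single individual, equation (5) can be evaluated as in the following Python sketch. It is an illustration only: the function name <monospace>epistatic_predictor</monospace> and its argument names are hypothetical, and the noise term is passed in by the caller rather than drawn inside the function.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
def epistatic_predictor(g1, g2, e1, e2, e12=0.0,
                        covariates=(), alphas=(), noise=0.0):
    """z_i = E1*g1 + E2*g2 + E12*g1*g2 + sum_j alpha_j * X_j + eps_i.

    g1, g2      genotypes (0, 1 or 2) of the two SNPs for one individual
    e1, e2      main effect sizes of the two SNPs
    e12         epistatic effect size (zero by default, as in the text)
    covariates  covariate values X_j for this individual
    alphas      covariate effect sizes alpha_j, in the same order
    noise       residual term eps_i supplied by the caller
    """
    cov_term = sum(a * x for a, x in zip(alphas, covariates))
    return e1 * g1 + e2 * g2 + e12 * g1 * g2 + cov_term + noise


# Same individual without and with an interaction effect.
z_additive = epistatic_predictor(1, 2, e1=0.5, e2=0.2)
z_epistatic = epistatic_predictor(1, 2, e1=0.5, e2=0.2, e12=0.3)
```
                    </preformat>
                </p>
                <p>The difference between the two values equals the interaction term, which is non-zero only when the individual carries minor alleles at both SNPs.</p>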
            </sec>
            <sec>
                <title>Output files</title>
                <p>Output files are written in formats that can be used directly as input for the following tools: 
                    <italic toggle="yes">EMMAX</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-14">14</xref>
                    </sup>, 
                    <italic toggle="yes">Blossoc</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-4">4</xref>
                    </sup>, 
                    <italic toggle="yes">Plink</italic> (.ped file), 
                    <italic toggle="yes">QTDT</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-15">15</xref>
                    </sup>, 
                    <italic toggle="yes">TASSEL</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-16">16</xref>
                    </sup>, 
                    <italic toggle="yes">GenABEL</italic>
                    <sup>
                        <xref ref-type="bibr" rid="ref-17">17</xref>
                    </sup> (see 
                    <xref ref-type="table" rid="T1">Table 1</xref>).</p>
                <table-wrap id="T1" orientation="portrait" position="anchor">
                    <label>Table 1. </label>
                    <caption>
                        <title>Output file formats supported by phenotype simulator 
                            <italic toggle="yes">cophesim</italic>.</title>
                        <p>Each of the options shown below selects an output format. Each output format has its own file suffixes, which identify the file type. These output formats are consistent with those used in 
                            <italic toggle="yes">phenosim</italic>.</p>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1">Application</th>
                                <th align="center" colspan="1" rowspan="1">Option</th>
                                <th align="left" colspan="1" rowspan="1">Commentary</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">EMMAX</td>
                                <td align="center" colspan="1" rowspan="1">-emmax</td>
                                <td align="left" colspan="1" rowspan="1">Suffixes .emma_geno, .emma_pheno</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">Blossoc</td>
                                <td align="center" colspan="1" rowspan="1">-blossoc</td>
                                <td align="left" colspan="1" rowspan="1">Suffixes .blossoc_pos, .blossoc_geno</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">PLINK</td>
                                <td align="center" colspan="1" rowspan="1">-plink</td>
                                <td align="left" colspan="1" rowspan="1">Used by default across all phenotypes,
                                    <break/>except survival. Suffixes .ped, .map,
                                    <break/>.pheno.</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">QTDT</td>
                                <td align="center" colspan="1" rowspan="1">-qtdt</td>
                                <td align="left" colspan="1" rowspan="1">Suffixes .ped, .map, .dat</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">TASSEL</td>
                                <td align="center" colspan="1" rowspan="1">-tassel</td>
                                <td align="left" colspan="1" rowspan="1">Suffixes .poly, .trait</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1">GenABEL</td>
                                <td align="center" colspan="1" rowspan="1">-</td>
                                <td align="left" colspan="1" rowspan="1">This format is used in simulation of
                                    <break/>survival phenotype.</td>
                            </tr>
                        </tbody>
                    </table>
                </table-wrap>
            </sec>
            <sec>
                <title>Operation</title>
                <p>
                    <italic toggle="yes">cophesim</italic> is freely available for download from the following link: 
                    <ext-link ext-link-type="uri" xlink:href="https://bitbucket.org/izhbannikov/cophesim">https://bitbucket.org/izhbannikov/cophesim</ext-link>. Requirements (in order to run the examples): 
                    <italic toggle="yes">Python</italic> v2.7.10 or newer, 
                    <italic toggle="yes">plinkio</italic> v0.9.6, R v3.2.4 or newer, and 
                    <italic toggle="yes">Plink</italic> v1.07. The user manual is provided in a separate file &#x201c;cophesim.pdf&#x201d; located in the program directory and is also available as 
                    <xref ref-type="other" rid="SF2">Supplementary File 1</xref>.</p>
            </sec>
        </sec>
        <sec>
            <title>Use case</title>
            <p>Below we present an example that shows simulation of genetic data and then simulation of three different phenotypic traits. Other examples and installation instructions are provided at the program website and also in the user manual. Refer to the user manual for description of input parameters.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                    <styled-content style="font-size:15px;">#-------------------------------------------Example begins---------------------------------------#
#Step 1: genetic data simulation:
plink --simulate-ncases 5000 --simulate-ncontrols 5000 --simulate wgas.sim --out sim.plink --make-bed
#Step 2: Convert .bed to .ped:
plink --bfile sim.plink --recode --out sim.plink
#Step 3: phenotype simulation from previously made genetic data:
python cophesim.py -i sim.plink -o testout -itype plink -otype plink -c -ce effects.txt -s -gomp
#-------------------------------------------End of example---------------------------------------#</styled-content>
                </preformat>
            </p>
            <p>In this example, we first (Step 1) simulate genetic data using 
                <italic toggle="yes">Plink</italic>. We simulate 
                <italic toggle="yes">N</italic>.
                <italic toggle="yes">cases</italic> = 
                <italic toggle="yes">N</italic>.
                <italic toggle="yes">control</italic> = 5,000 cases and controls and 1,000 SNPs (defined in 
                <monospace>wgas.sim</monospace> file; refer to the 
                <italic toggle="yes">Plink</italic> website for documentation of this file type). Then (Step 2) we convert the binary 
                <monospace>sim.plink.bed</monospace> file to 
                <monospace>sim.plink.ped</monospace> (option 
                <monospace>--recode</monospace> in Plink). This step is not required since cophesim can handle binary 
                <italic toggle="yes">Plink</italic> files (
                <monospace>.bed</monospace>, 
                <monospace>.bim</monospace>, 
                <monospace>.fam</monospace>), but we provide this step in order to show the ability of the program to deal with 
                <italic toggle="yes">Plink</italic> PED format. Finally (Step 3), we simulate dichotomous (by default), continuous (option 
                <monospace>-c</monospace>) and survival (option 
                <monospace>-s</monospace>) traits from previously simulated data stored in files 
                <monospace>sim.plink.ped</monospace> and 
                <monospace>sim.plink.map</monospace>. Note that we simulate the survival trait with the Gompertz hazard function (option 
                <monospace>-gomp</monospace>); effect sizes for causal variants are provided in the file 
                <monospace>effects.txt</monospace> (to include this file we use option 
                <monospace>-ce</monospace>).</p>
            <sec>
                <title>ROC curves</title>
                <p>We provide Receiver-Operating Characteristic (ROC) curves (
                    <xref ref-type="fig" rid="f2">Figure 2</xref>) constructed from association tests performed on a simulated dataset. Simulation and association testing were performed with the 
                    <italic toggle="yes">Plink</italic> suite. The following parameters were used: 
                    <italic toggle="yes">N</italic> = 10,000 individuals, 
                    <italic toggle="yes">N</italic>.
                    <italic toggle="yes">snp</italic>.
                    <italic toggle="yes">c</italic> = 100 causal variants, out of a total of 
                    <italic toggle="yes">N</italic>.
                    <italic toggle="yes">snp</italic> = 1,000 variants. Causal variants were labeled with &#x2018;1&#x2019; and the other (neutral) variants were labeled with &#x2018;0&#x2019;. These labels were later used as the ground truth when calculating the TPR (true positive rate) and FPR (false positive rate). Dichotomous, continuous and survival phenotypic traits were simulated with 
                    <italic toggle="yes">cophesim</italic>. Then association tests were performed with 
                    <italic toggle="yes">Plink</italic> for dichotomous and continuous traits (using 
                    <italic toggle="yes">Plink</italic> flags 
                    <monospace>--logistic</monospace> and 
                    <monospace>-regression</monospace>, respectively). Association tests for the survival trait were performed with the R package 
                    <italic toggle="yes">GenABEL</italic>. The 
                    <italic toggle="yes">p</italic>-values returned by the association tests for each variant were then compared to the significance threshold. Variants that passed the threshold were classified as causal, that is, associated with the simulated phenotype. These classification results were then compared to the true labels (defined above) in order to obtain the TPR and FPR. For all these tests, we varied the significance threshold from 0 to 1 in increments of 0.001.</p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>ROC curves constructed from results of association tests performed on a simulated dataset of 
                            <italic toggle="yes">N</italic> = 10,000 individuals with 100 causal variants out of 1,000 total SNP sites.</title>
                        <p>TPR: true positive rate; FPR: false positive rate. These results were calculated for dichotomous, continuous and survival traits. The dashed 45-degree line represents random guessing.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12938/139a6dfe-3ead-4a3c-b6cd-eae867165496_figure2.gif"/>
                </fig>
                <p>The R code to construct ROC curves is provided in the file &#x201c;roc.R&#x201d;. This file is attached to this computer note and is also available in the data repository: 
                    <ext-link ext-link-type="uri" xlink:href="https://bitbucket.org/izhbannikov/cophesim_data/ROC/roc.R">https://bitbucket.org/izhbannikov/cophesim_data/ROC/roc.R</ext-link>
                </p>
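                <p>The thresholding procedure described in this section can also be sketched in a few lines of Python. The sketch below is not a translation of &#x201c;roc.R&#x201d;; the function names <monospace>roc_points</monospace> and <monospace>auc</monospace> are hypothetical, the toy <italic toggle="yes">p</italic>-values are made up for illustration, and we assume that a variant passes the threshold when its <italic toggle="yes">p</italic>-value does not exceed it.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```python
def roc_points(p_values, labels, n_steps=1000):
    """Return (FPR, TPR) pairs for thresholds 0, 1/n_steps, ..., 1.

    labels: 1 for a causal variant, 0 for a neutral variant.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for k in range(n_steps + 1):
        threshold = k / n_steps
        # Assumption: a variant is called associated when p does not exceed
        # the threshold.
        called = [threshold >= p for p in p_values]
        tp = sum(1 for c, y in zip(called, labels) if c and y == 1)
        fp = sum(1 for c, y in zip(called, labels) if c and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy data: two causal variants with small p-values, two neutral variants.
p_values = [0.0005, 0.02, 0.3, 0.8]
labels = [1, 1, 0, 0]
points = roc_points(p_values, labels)
area = auc(points)
```
                    </preformat>
                </p>
                <p>With the default of 1,000 steps, the thresholds match the increment of 0.001 used above; a curve that follows the diagonal corresponds to an area of 0.5.</p>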
            </sec>
        </sec>
        <sec sec-type="conclusions">
            <title>Conclusion</title>
            <p>In this work we presented 
                <italic toggle="yes">cophesim</italic>, a tool for phenotype simulation from genetic data obtained either by simulation or from real data collection. 
                <italic toggle="yes">cophesim</italic> makes it possible to simulate various demographic models under user-defined scenarios.</p>
        </sec>
        <sec>
            <title>Software and data availability</title>
            <p>Tool and source code available from: 
                <ext-link ext-link-type="uri" xlink:href="https://bitbucket.org/izhbannikov/cophesim">https://bitbucket.org/izhbannikov/cophesim</ext-link>
            </p>
            <p>Archived source code as at time of publication: doi:
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.810195">10.5281/zenodo.810195</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-18">18</xref>
                </sup>
            </p>
            <p>License: MIT</p>
            <p>The example script and output files for the software are available at: 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.804090">https://doi.org/10.5281/zenodo.804090</ext-link>
                <sup>
                    <xref ref-type="bibr" rid="ref-19">19</xref>
                </sup>.</p>
            <p>To test 
                <italic toggle="yes">cophesim</italic>, we provide a repository &#x201c;cophesim_data&#x201d;: 
                <ext-link ext-link-type="uri" xlink:href="https://bitbucket.org/izhbannikov/cophesim_data">https://bitbucket.org/izhbannikov/cophesim_data</ext-link>. Download or clone this repository to be able to run tests.</p>
        </sec>
    </body>
    <back>
        <sec id="SM1" sec-type="supplementary-material">
            <title>Supplementary material</title>
            <p id="SF1">Table S1: Best available phenotype/genotype simulation software applications and their comparison to 
                <italic toggle="yes">cophesim</italic> in terms of ability to simulate different types of phenotypic traits. (
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/11968/c65c7ddd-d305-4043-a722-e850f2413f10.docx">https://f1000researchdata.s3.amazonaws.com/supplementary/11968/c65c7ddd-d305-4043-a722-e850f2413f10.docx</ext-link>)</p>
            <p id="SF2">Supplementary File 1: User manual for 
                <italic toggle="yes">cophesim</italic> (
                <ext-link ext-link-type="uri" xlink:href="https://f1000researchdata.s3.amazonaws.com/supplementary/11968/42ab5de2-8130-4b8c-a7ce-abb2f3d55648.pdf">https://f1000researchdata.s3.amazonaws.com/supplementary/11968/42ab5de2-8130-4b8c-a7ce-abb2f3d55648.pdf</ext-link>).</p>
        </sec>
        <ref-list>
            <ref id="ref-1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Liang</surname>
                            <given-names>L</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Z&#x00f6;llner</surname>
                            <given-names>S</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Abecasis</surname>
                            <given-names>GR</given-names>
                        </name>
					</person-group>:
                    <article-title>Genome: a rapid coalescent-based whole genome simulator.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2007</year>;<volume>23</volume>(<issue>12</issue>):<fpage>1565</fpage>&#x2013;<lpage>7</lpage>.
                    <pub-id pub-id-type="pmid">17459963</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btm138</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Purcell</surname>
                            <given-names>S</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Neale</surname>
                            <given-names>B</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Todd-Brown</surname>
                            <given-names>K</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>PLINK: a tool set for whole-genome association and population-based linkage analyses.</article-title>
                    <source>
						
                        <italic toggle="yes">Am J Hum Genet.</italic>
					</source>
                    <year>2007</year>;<volume>81</volume>(<issue>3</issue>):<fpage>559</fpage>&#x2013;<lpage>575</lpage>.
                    <pub-id pub-id-type="pmid">17701901</pub-id>
                    <pub-id pub-id-type="doi">10.1086/519795</pub-id>
                    <pub-id pub-id-type="pmcid">1950838</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>G&#x00fc;nther</surname>
                            <given-names>T</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Gawenda</surname>
                            <given-names>I</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Schmid</surname>
                            <given-names>KJ</given-names>
                        </name>
					</person-group>:
                    <article-title>phenosim--A software to simulate phenotypes for testing in genome-wide association studies.</article-title>
                    <source>
						
                        <italic toggle="yes">BMC Bioinformatics.</italic>
					</source>
                    <year>2011</year>;<volume>12</volume>(<issue>1</issue>):<fpage>265</fpage>.
                    <pub-id pub-id-type="pmid">21714868</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2105-12-265</pub-id>
                    <pub-id pub-id-type="pmcid">3150295</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Mailund</surname>
                            <given-names>T</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Schierup</surname>
                            <given-names>MH</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Pedersen</surname>
                            <given-names>CN</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>CoaSim: a flexible environment for simulating genetic data under coalescent models.</article-title>
                    <source>
						
                        <italic toggle="yes">BMC Bioinformatics.</italic>
					</source>
                    <year>2005</year>;<volume>6</volume>(<issue>1</issue>):<fpage>252</fpage>.
                    <pub-id pub-id-type="pmid">16225674</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2105-6-252</pub-id>
                    <pub-id pub-id-type="pmcid">1274299</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Hoggart</surname>
                            <given-names>CJ</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Chadeau-Hyam</surname>
                            <given-names>M</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Clark</surname>
                            <given-names>TG</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Sequence-level population simulations over large genomic regions.</article-title>
                    <source>
						
                        <italic toggle="yes">Genetics.</italic>
					</source>
                    <year>2007</year>;<volume>177</volume>(<issue>3</issue>):<fpage>1725</fpage>&#x2013;<lpage>1731</lpage>.
                    <pub-id pub-id-type="pmid">17947444</pub-id>
                    <pub-id pub-id-type="doi">10.1534/genetics.106.069088</pub-id>
                    <pub-id pub-id-type="pmcid">2147962</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <label>6</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Lambert</surname>
                            <given-names>BW</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Terwilliger</surname>
                            <given-names>JD</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Weiss</surname>
                            <given-names>KM</given-names>
                        </name>
					</person-group>:
                    <article-title>
                        <italic toggle="yes">ForSim</italic>: a tool for exploring the genetic architecture of complex traits with controlled truth.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2008</year>;<volume>24</volume>(<issue>16</issue>):<fpage>1821</fpage>&#x2013;<lpage>2</lpage>.
                    <pub-id pub-id-type="pmid">18565989</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btn317</pub-id>
                    <pub-id pub-id-type="pmcid">2732213</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Neuenschwander</surname>
                            <given-names>S</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hospital</surname>
                            <given-names>F</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Guillaume</surname>
                            <given-names>F</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>quantiNemo: an individual-based program to simulate quantitative traits with explicit genetic architecture in a dynamic metapopulation.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2008</year>;<volume>24</volume>(<issue>13</issue>):<fpage>1552</fpage>&#x2013;<lpage>3</lpage>.
                    <pub-id pub-id-type="pmid">18450810</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btn219</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Yang</surname>
                            <given-names>J</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Lee</surname>
                            <given-names>SH</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Goddard</surname>
                            <given-names>ME</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>GCTA: a tool for genome-wide complex trait analysis.</article-title>
                    <source>
						
                        <italic toggle="yes">Am J Hum Genet.</italic>
					</source>
                    <year>2011</year>;<volume>88</volume>(<issue>1</issue>):<fpage>76</fpage>&#x2013;<lpage>82</lpage>.
                    <pub-id pub-id-type="pmid">21167468</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.ajhg.2010.11.011</pub-id>
                    <pub-id pub-id-type="pmcid">3014363</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Spencer</surname>
                            <given-names>CC</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Su</surname>
                            <given-names>Z</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Donnelly</surname>
                            <given-names>P</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Designing genome-wide association studies: sample size, power, imputation, and the choice of genotyping chip.</article-title>
                    <source>
						
                        <italic toggle="yes">PLoS Genet.</italic>
					</source>
                    <year>2009</year>;<volume>5</volume>(<issue>5</issue>):<fpage>e1000477</fpage>.
                    <pub-id pub-id-type="pmid">19492015</pub-id>
                    <pub-id pub-id-type="doi">10.1371/journal.pgen.1000477</pub-id>
                    <pub-id pub-id-type="pmcid">2688469</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Chung</surname>
                            <given-names>RH</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Shih</surname>
                            <given-names>CC</given-names>
                        </name>
					</person-group>:
                    <article-title>SeqSIMLA: a sequence and phenotype simulation tool for complex disease studies.</article-title>
                    <source>
						
                        <italic toggle="yes">BMC Bioinformatics.</italic>
					</source>
                    <year>2013</year>;<volume>14</volume>(<issue>1</issue>):<fpage>199</fpage>.
                    <pub-id pub-id-type="pmid">23782512</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2105-14-199</pub-id>
                    <pub-id pub-id-type="pmcid">3693898</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>B</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>G</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Leal</surname>
                            <given-names>SM</given-names>
                        </name>
					</person-group>:
                    <article-title>SimRare: a program to generate and analyze sequence-based data for association studies of quantitative and qualitative traits.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2012</year>;<volume>28</volume>(<issue>20</issue>):<fpage>2703</fpage>&#x2013;<lpage>4</lpage>.
                    <pub-id pub-id-type="pmid">22914216</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/bts499</pub-id>
                    <pub-id pub-id-type="pmcid">3467746</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Ewing</surname>
                            <given-names>G</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Hermisson</surname>
                            <given-names>J</given-names>
                        </name>
					</person-group>:
                    <article-title>
                        <italic toggle="yes">MSMS</italic>: a coalescent simulation program including recombination, demographic structure and selection at a single locus.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2010</year>;<volume>26</volume>(<issue>16</issue>):<fpage>2064</fpage>&#x2013;<lpage>5</lpage>.
                    <pub-id pub-id-type="pmid">20591904</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btq322</pub-id>
                    <pub-id pub-id-type="pmcid">2916717</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Bender</surname>
                            <given-names>R</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Augustin</surname>
                            <given-names>T</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Blettner</surname>
                            <given-names>M</given-names>
                        </name>
					</person-group>:
                    <article-title>Generating survival times to simulate Cox proportional hazards models.</article-title>
                    <source>
						
                        <italic toggle="yes">Stat Med.</italic>
					</source>
                    <year>2005</year>;<volume>24</volume>(<issue>11</issue>):<fpage>1713</fpage>&#x2013;<lpage>1723</lpage>.
                    <pub-id pub-id-type="pmid">15724232</pub-id>
                    <pub-id pub-id-type="doi">10.1002/sim.2059</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Kang</surname>
                            <given-names>HM</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Sul</surname>
                            <given-names>JH</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Service</surname>
                            <given-names>SK</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>Variance component model to account for sample structure in genome-wide association studies.</article-title>
                    <source>
						
                        <italic toggle="yes">Nat Genet.</italic>
					</source>
                    <year>2010</year>;<volume>42</volume>(<issue>4</issue>):<fpage>348</fpage>&#x2013;<lpage>54</lpage>.
                    <pub-id pub-id-type="pmid">20208533</pub-id>
                    <pub-id pub-id-type="doi">10.1038/ng.548</pub-id>
                    <pub-id pub-id-type="pmcid">3092069</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Abecasis</surname>
                            <given-names>GR</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Cardon</surname>
                            <given-names>LR</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Cookson</surname>
                            <given-names>WO</given-names>
                        </name>
					</person-group>:
                    <article-title>A general test of association for quantitative traits in nuclear families.</article-title>
                    <source>
						
                        <italic toggle="yes">Am J Hum Genet.</italic>
					</source>
                    <year>2000</year>;<volume>66</volume>(<issue>1</issue>):<fpage>279</fpage>&#x2013;<lpage>292</lpage>.
                    <pub-id pub-id-type="pmid">10631157</pub-id>
                    <pub-id pub-id-type="doi">10.1086/302698</pub-id>
                    <pub-id pub-id-type="pmcid">1288332</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Bradbury</surname>
                            <given-names>PJ</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>Z</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Kroon</surname>
                            <given-names>DE</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>TASSEL: software for association mapping of complex traits in diverse samples.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2007</year>;<volume>23</volume>(<issue>19</issue>):<fpage>2633</fpage>&#x2013;<lpage>5</lpage>.
                    <pub-id pub-id-type="pmid">17586829</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btm308</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Aulchenko</surname>
                            <given-names>YS</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Ripke</surname>
                            <given-names>S</given-names>
                        </name>
						
                        <name name-style="western">
                            <surname>Isaacs</surname>
                            <given-names>A</given-names>
                        </name>
						
                        <etal/>
					</person-group>:
                    <article-title>GenABEL: an R library for genome-wide association analysis.</article-title>
                    <source>
						
                        <italic toggle="yes">Bioinformatics.</italic>
					</source>
                    <year>2007</year>;<volume>23</volume>(<issue>10</issue>):<fpage>1294</fpage>&#x2013;<lpage>6</lpage>.
                    <pub-id pub-id-type="pmid">17384015</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btm108</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Zhbannikov</surname>
                            <given-names>I</given-names>
                        </name>
					</person-group>:
                    <article-title>izhbannikov/release-1.4.1.</article-title>
                    <source>
						
                        <italic toggle="yes">Zenodo.</italic>
						</source>
                    <year>2017</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.822163">Data Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">
						
                        <name name-style="western">
                            <surname>Zhbannikov</surname>
                            <given-names>I</given-names>
                        </name>
					</person-group>:
                    <article-title>izhbannikov/cophesim_data: First release.</article-title>
                    <source>
						
                        <italic toggle="yes">Zenodo.</italic>
					</source>
                    <year>2017</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://dx.doi.org/10.5281/zenodo.804090">Data Source</ext-link>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report25816">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.12938.r25816</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>R&#x00f6;nneg&#x00e5;rd</surname>
                        <given-names>Lars</given-names>
                    </name>
                    <xref ref-type="aff" rid="r25816a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1057-5401</uri>
                </contrib>
                <contrib contrib-type="author">
                    <name>
                        <surname>Mouresan</surname>
                        <given-names>Elena - Flavia</given-names>
                    </name>
                    <xref ref-type="aff" rid="r25816a2">2</xref>
                    <role>Co-referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1335-7610</uri>
                </contrib>
                <aff id="r25816a1">
                    <label>1</label>Dalarna University, Falun, Sweden</aff>
                <aff id="r25816a2">
                    <label>2</label>Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>18</day>
                <month>9</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 R&#x00f6;nneg&#x00e5;rd L and Mouresan EF</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport25816" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.11968.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Zhbannikov and co-workers present the 
                <italic>cophesim</italic> software that they have developed for simulating phenotypic data using genotype information. The input and output file formats are compatible with many of the most commonly used computer programs for genome wide association studies. The software is flexible, well documented and fills a gap in existing tools, especially for simulating time-to-event phenotypes. The paper is well written and easy to follow, and we only have some minor comments and suggestions.</p>
            <p>Minor comments: 
                <list list-type="bullet">
                    <list-item>
                        <p>A closing bracket is missing in the sentence below equation (3).</p>
                    </list-item>
                    <list-item>
                        <p>In equation (4), if the user does not provide gene effects, the phenotype is built from the sum of the standardized genotypes for each individual. Could you motivate this choice and explain why it would be useful?</p>
                    </list-item>
                    <list-item>
                        <p>In equation (5), the subscripts look wrong: a_i should be a_j.</p>
                    </list-item>
                    <list-item>
                        <p>In the Linkage Disequilibrium section, the term &#x201c;copula&#x201d; is used. We do not think most readers of this paper can be expected to be acquainted with copulas, so a reference is needed.</p>
                    </list-item>
                </list> Consider adding a short paragraph where you discuss limitations and the possibility to add further functionality in the future, including: 
                <list list-type="bullet">
                    <list-item>
                        <p>Dominance effects</p>
                    </list-item>
                    <list-item>
                        <p>Probit link for binary data</p>
                    </list-item>
                    <list-item>
                        <p>Simulation of correlated traits</p>
                    </list-item>
                    <list-item>
                        <p>Alternative ways to simulate LD including a copula approach</p>
                    </list-item>
                </list></p>
            <p> Check that the following link (at the end of the paper) works: 
                <ext-link ext-link-type="uri" xlink:href="https://bitbucket.org/izhbannikov/cophesim_data/ROC/roc.R">https://bitbucket.org/izhbannikov/cophesim_data/ROC/roc.R</ext-link> (we were able to retrieve the code from 
                <ext-link ext-link-type="uri" xlink:href="https://bitbucket.org/izhbannikov/cophesim_data/src/">https://bitbucket.org/izhbannikov/cophesim_data/src/</ext-link>)</p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Yes</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report24718">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.12938.r24718</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Mitnitski</surname>
                        <given-names>Arnold B.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r24718a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r24718a1">
                    <label>1</label>Department of Medicine, Dalhousie University, Halifax, NS, Canada</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>8</day>
                <month>8</month>
                <year>2017</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Mitnitski AB</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport24718" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.11968.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>In the manuscript &#x201c;Cophesim: a comprehensive phenotype simulator for testing novel association methods&#x201d;, I. Zhbannikov and colleagues from Duke University present software that generates phenotypes for genotype data prepared with a genetic simulator, for use in evaluating genome-wide association study (GWAS) tools. The rationale for developing the software is clearly explained. The idea of the study is to use computer simulations to model data under specific assumptions. Similar simulators exist, but none of them can simulate survival phenotypes. The authors also review several other disadvantages of the existing simulators.</p>
            <p>The description of the software is technically sound. The methods section is clearly presented. Dichotomous phenotypes are simulated with a logistic model whose predictors are genetic variants and covariates. Continuous phenotypes are simulated using linear regression. Survival phenotypes are modeled using proportional hazards with the inverse probability method.</p>
            <p>The details of the code, methods and analysis allow replication of the software and its use by others. The output formats are compatible with other applications (Table 1). The article includes a useful example of using the simulator, and further examples are available in the manual. The ROC curve example is also very useful. The information provided is sufficient to allow interpretation of the expected results.</p>
            <p>In short, Cophesim is a useful tool for genetic analyses. The article is scientifically sound and the methods are described in detail; it will greatly help researchers interested in the application of genetic analyses.</p>
            <p>Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?</p>
            <p>Yes</p>
            <p>Is the rationale for developing the new software tool clearly explained?</p>
            <p>Yes</p>
            <p>Is the description of the software tool technically sound?</p>
            <p>Yes</p>
            <p>Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?</p>
            <p>Yes</p>
            <p>Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
</article>
