<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.8839.3</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Method Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                    <subj-group>
                        <subject>Bioinformatics</subject>
                    </subj-group>
                    <subj-group>
                        <subject>Genomics</subject>
                    </subj-group>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>A cross-package Bioconductor workflow for analysing methylation array data</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 3; peer review: 4 approved]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Maksimovic</surname>
                        <given-names>Jovana</given-names>
                    </name>
                    <uri content-type="orcid">https://orcid.org/0000-0002-9458-3061</uri>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Phipson</surname>
                        <given-names>Belinda</given-names>
                    </name>
                    <uri content-type="orcid">https://orcid.org/0000-0002-1711-7454</uri>
                    <xref ref-type="corresp" rid="c2">b</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Oshlack</surname>
                        <given-names>Alicia</given-names>
                    </name>
                    <uri content-type="orcid">https://orcid.org/0000-0001-9788-5690</uri>
                    <xref ref-type="aff" rid="a1">1</xref>
                    <xref ref-type="aff" rid="a2">2</xref>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Murdoch Childrens Research Institute, Royal Children&#x2019;s Hospital, Melbourne, Australia</aff>
                <aff id="a2">
                    <label>2</label>School of BioSciences, University of Melbourne, Melbourne, Australia</aff>
                <aff id="a3">
                    <label>3</label>School of Physics, University of Melbourne, Melbourne, Australia</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:jovana.maksimovic@mcri.edu.au">jovana.maksimovic@mcri.edu.au</email>
                </corresp>
                <corresp id="c2">
                    <label>b</label>
                    <email xlink:href="mailto:belinda.phipson@mcri.edu.au">belinda.phipson@mcri.edu.au</email>
                </corresp>
                <fn fn-type="con">
                    <p>JM and BP designed the content and wrote the paper. AO oversaw the project and contributed to the writing and editing of the paper.</p>
                </fn>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>5</day>
                <month>4</month>
                <year>2017</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2016</year>
            </pub-date>
            <volume>5</volume>
            <elocation-id>1281</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>3</day>
                    <month>4</month>
                    <year>2017</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2017 Maksimovic J et al.</copyright-statement>
                <copyright-year>2017</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/5-1281/pdf"/>
            <abstract>
                <p>Methylation in the human genome is known to be associated with development and disease. The Illumina Infinium methylation arrays are by far the most common way to interrogate methylation across the human genome. This paper provides a Bioconductor workflow using multiple packages for the analysis of methylation array data. Specifically, we demonstrate the steps involved in a typical differential methylation analysis pipeline including: quality control, filtering, normalization, data exploration and statistical testing for probe-wise differential methylation. We further outline other analyses such as differential methylation of regions, differential variability analysis, estimating cell type composition and gene ontology testing. Finally, we provide some examples of how to visualise methylation array data.</p>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>methylation</kwd>
                <kwd>bioconductor</kwd>
                <kwd>workflow</kwd>
                <kwd>array</kwd>
            </kwd-group>
            <funding-group>
                <funding-statement>AO was supported by an NHMRC Career Development Fellowship APP1051481.</funding-statement>
                <funding-statement>
                    <italic>The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</italic>
                </funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Revised</label>
                <title>Amendments from Version 2</title>
                <p>We have fixed bugs in the workflow code that were introduced with changes to the arguments required by&#x00a0;the "cpg.annotate" and "DMR.plot" functions in the 
                    <italic>DMRcate</italic> package. Figure 11 once again shows the top ranked DMR, as in version 1 of the workflow. A bug present in the 
                    <italic>minfi</italic>&#x00a0;"preprocessQuantile" function during the generation of version 2 necessitated&#x00a0;us to plot DMR 11 in order to show all of the features we were trying to display; this has now been fixed by the 
                    <italic>minfi</italic>&#x00a0;authors. We have changed the links to the data on figshare to a modified version of the dataset, which now includes CpG islands and DNase I hypersensitive sites for chr17 instead of chr22, to reflect the change to the DMR that is being shown in Figure 11. We have also included a link to a live&#x00a0;(as opposed to static) version of this workflow on the Bioconductor website.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec sec-type="intro">
            <title>Introduction</title>
            <p>DNA methylation, the addition of a methyl group to the cytosine of a CG dinucleotide in DNA, is the most extensively studied epigenetic mark due to its role in both development and disease (
                <xref ref-type="bibr" rid="ref-6">Bird, 2002</xref>; 
                <xref ref-type="bibr" rid="ref-20">Laird, 2003</xref>). Although DNA methylation can be measured in several ways, the epigenetics community has enthusiastically embraced the Illumina HumanMethylation450 (450k) array (
                <xref ref-type="bibr" rid="ref-4">Bibikova 
                    <italic toggle="yes">et al.</italic>, 2011</xref>) as a cost-effective way to assay methylation across the human genome. More recently, Illumina has increased the genomic coverage of the platform to &gt;850,000 sites with the release of their MethylationEPIC (850k) array. As methylation arrays are likely to remain popular for measuring methylation for the foreseeable future, it is necessary to provide robust workflows for methylation array analysis.</p>
            <p>Measurement of DNA methylation by Infinium technology (Infinium I) was first employed by Illumina on the HumanMethylation27 (27k) array (
                <xref ref-type="bibr" rid="ref-5">Bibikova 
                    <italic toggle="yes">et al.</italic>, 2009</xref>), which measured methylation at approximately 27,000 CpGs, primarily in gene promoters. Like bisulfite sequencing, the Infinium assay detects methylation status at single base resolution. However, due to its relatively limited coverage, the array platform was not truly considered &#x201c;genome-wide&#x201d; until the arrival of the 450k array. The 450k array increased the genomic coverage of the platform to over 450,000 gene-centric sites by combining the original Infinium I assay with the novel Infinium II probes. Both assay types employ 50bp probes that query a [C/T] polymorphism created by bisulfite conversion of unmethylated cytosines in the genome; however, the Infinium I and II assays differ in the number of beads required to detect methylation at a single locus. Infinium I uses two bead types per CpG, one for each of the methylated and unmethylated states (
                <xref ref-type="fig" rid="f1">Figure 1a</xref>). In contrast, the Infinium II design uses one bead type and the methylation state is determined at the single base extension step after hybridization (
                <xref ref-type="fig" rid="f1">Figure 1b</xref>). The 850k array also uses a combination of the Infinium I and II assays but achieves additional coverage by increasing the size of each array; a 450k slide contains 12 arrays whilst the 850k has only 8.</p>
            <p>Regardless of the Illumina array version, for each CpG, there are two measurements: a methylated intensity (denoted by 
                <italic toggle="yes">M</italic>) and an unmethylated intensity (denoted by 
                <italic toggle="yes">U</italic>). These intensity values can be used to determine the proportion of methylation at each CpG locus. Methylation levels are commonly reported as either beta values (
                <italic toggle="yes">&#x03b2;</italic> = 
                <italic toggle="yes">M/</italic>(
                <italic toggle="yes">M</italic> + 
                <italic toggle="yes">U</italic>)) or M-values (
                <italic toggle="yes">M value</italic> = 
                <italic toggle="yes">log</italic><sub>2</sub>(
                <italic toggle="yes">M/U</italic>)). For practical purposes, a small offset, 
                <italic toggle="yes">&#x03b1;</italic>, can be added to the denominator of the 
                <italic toggle="yes">&#x03b2;</italic> value equation to avoid dividing by small values, which is the default behaviour of the 
                <monospace>getBeta</monospace> function in 
                <italic toggle="yes">minfi</italic>. The default value for 
                <italic toggle="yes">&#x03b1;</italic> is 100. It may also be desirable to add a small offset to the numerator and denominator when calculating M-values to avoid dividing by zero in rare cases, however the default 
                <monospace>getM</monospace> function in 
                <italic toggle="yes">minfi</italic> does not do this. Beta values and M-values are related through a logit transformation. Beta values are generally preferable for describing the level of methylation at a locus or for graphical presentation because percentage methylation is easily interpretable. However, due to their distributional properties, M-values are more appropriate for statistical testing (
                <xref ref-type="bibr" rid="ref-10">Du 
                    <italic toggle="yes">et al.</italic>, 2010</xref>).</p>
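            <p>The two scales can be illustrated with a minimal base R sketch. The intensities below are invented for illustration, the M-value offset of 1 is an assumption rather than a recommended value, and the calculation is done by hand rather than through the 
                <italic toggle="yes">minfi</italic> accessor functions.</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# hypothetical methylated (M) and unmethylated (U) intensities for four CpGs
meth   = c(12000,  300, 5000,  0)
unmeth = c(  800, 9500, 5000, 40)

# offset added to the denominator; 100 is the value quoted in the text above
alpha = 100

# beta values: proportion of methylation, bounded between 0 and 1
betaVals = meth / (meth + unmeth + alpha)

# M-values: log2 ratio of intensities; the offset of 1 (an assumption)
# on numerator and denominator guards against division by zero
offsetM = 1
mVals = log2((meth + offsetM) / (unmeth + offsetM))

# without offsets the two scales are linked by the logit transformation,
# shown here for the first three CpGs (the fourth has M = 0)
betaRaw = meth[1:3] / (meth[1:3] + unmeth[1:3])
mRaw = log2(meth[1:3] / unmeth[1:3])
all.equal(mRaw, log2(betaRaw / (1 - betaRaw)))

round(betaVals, 3)
round(mVals, 2)
                </preformat>
            </p>
            <p>The fourth CpG shows why the offsets matter: with a methylated intensity of zero, the M-value without an offset would be negative infinity.</p>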
            <p>In this workflow, we will provide examples of the steps involved in analysing methylation array data using R (
                <xref ref-type="bibr" rid="ref-31">R Core Team, 2014</xref>) and Bioconductor (
                <xref ref-type="bibr" rid="ref-17">Huber 
                    <italic toggle="yes">et al.</italic>, 2015</xref>), including: quality control, filtering, normalisation, data exploration and probe-wise differential methylation analysis. We will also cover other approaches such as differential methylation analysis of regions, differential variability analysis, gene ontology analysis and estimating cell type composition. Finally, we will provide some examples of useful ways to visualise methylation array data.</p>
        </sec>
        <sec>
            <title>Differential methylation analysis</title>
            <sec>
                <title>Obtaining the data</title>
                <p>All of the data used in this workflow can be downloaded and extracted in R using the 
                    <monospace>download.file</monospace> and 
                    <monospace>untar</monospace> functions, as shown below. Alternatively, the data can be manually downloaded from: 
                    <ext-link ext-link-type="uri" xlink:href="https://figshare.com/articles/methylAnalysisDataV3_tar_gz/4800970">https://figshare.com/articles/methylAnalysisDataV3_tar_gz/4800970</ext-link>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># the URL for the data download</styled-content>

                        <styled-content style="font-size:15px;">url &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"<ext-link ext-link-type="uri" xlink:href="https://ndownloader.figshare.com/files/7896205">https://ndownloader.figshare.com/files/7896205</ext-link>"</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;">
                            <italic toggle="yes"># download the data</italic>
                        </styled-content>

                        <styled-content style="font-size:15px;">if(!</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">file.exists</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"methylAnalysisDataV3.tar.gz"</styled-content>
                        <styled-content style="font-size:15px;">)){</styled-content>
    
                        <styled-content style="font-size:15px;color:#214A87;">download.file</styled-content>
                        <styled-content style="font-size:15px;">(url,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">destfile=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"methylAnalysisDataV3.tar.gz"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">method=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"auto"</styled-content>
                        <styled-content style="font-size:15px;">)
}</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;">
                            <italic toggle="yes"># extract the data</italic>
                        </styled-content>

                        <styled-content style="font-size:15px;">if(!</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">file.exists</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"./data"</styled-content>
                        <styled-content style="font-size:15px;">)){</styled-content>
    
                        <styled-content style="font-size:15px;color:#214A87;">untar</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"methylAnalysisDataV3.tar.gz"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">exdir=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"."</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">compressed=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"gzip"</styled-content>
                        <styled-content style="font-size:15px;">)
}</styled-content>
                    </preformat>
                </p>
                <p>Once the data has been downloaded and extracted, there should be a folder called 
                    <monospace>data</monospace> that contains all the files necessary to execute the workflow.</p>
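                <p>As an optional sanity check, the sketch below confirms that every green-channel IDAT file has a red-channel partner. It is only an illustration: the 
                    <monospace>files</monospace> vector holds a handful of names copied from the file listing in this section, and with the real data it would instead be the result of 
                    <monospace>list.files("./data", recursive=TRUE)</monospace>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# a few file names copied from the listing in this section (illustrative subset)
files = c("5975827018/5975827018_R06C02_Grn.idat",
          "5975827018/5975827018_R06C02_Red.idat",
          "6264509100/6264509100_R01C01_Grn.idat",
          "6264509100/6264509100_R01C01_Red.idat",
          "SampleSheet.csv")

# keep only the IDAT files
idats = grep("\\.idat$", files, value = TRUE)

# strip the channel suffix to obtain one basename per array
basenames = sub("_(Grn|Red)\\.idat$", "", idats)

# each array should contribute exactly two files (Grn and Red)
channelCounts = table(basenames)
unpaired = names(channelCounts)[channelCounts != 2]

# number of arrays found and number of arrays missing a channel
length(unique(basenames))
length(unpaired)
                    </preformat>
                </p>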
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>Figure 1. </label>
                    <caption>
                        <title>Illumina Infinium HumanMethylation450 assay, reproduced from 
                            <xref ref-type="bibr" rid="ref-24">Maksimovic 
                                <italic toggle="yes">et al.</italic>, 2012</xref>.</title>
                        <p>(
                            <bold>a</bold>) Infinium I assay. Each individual CpG is interrogated using two bead types: methylated (M) and unmethylated (U). Both bead types will incorporate the same labeled nucleotide for the same target CpG, thereby producing the same color fluorescence. The nucleotide that is added is determined by the base downstream of the &#x201c;C&#x201d; of the target CpG. The proportion of methylation can be calculated by comparing the intensities from the two different probes in the same color. (
                            <bold>b</bold>) Infinium II assay. Each target CpG is interrogated using a single bead type. Methylation state is detected by single base extension at the position of the &#x201c;C&#x201d; of the target CpG, which always results in the addition of a labeled &#x201c;G&#x201d; or &#x201c;A&#x201d; nucleotide, complementary to either the &#x201c;methylated&#x201d; C or &#x201c;unmethylated&#x201d; T, respectively. Each locus is detected in two colors, and methylation status is determined by comparing the two colors from the one position.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure1.gif"/>
                </fig>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># set up a path to the data directory</styled-content>

                        <styled-content style="font-size:15px;">dataDirectory &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"./data"</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># list the files</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">list.files</styled-content>
                        <styled-content style="font-size:15px;">(dataDirectory,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">recursive=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">##  [1] "48639-non-specific-probes-Illumina450k.csv"
##  [2] "5975827018/5975827018_R06C02_Grn.idat"
##  [3] "5975827018/5975827018_R06C02_Red.idat"
##  [4] "6264509100/6264509100_R01C01_Grn.idat"
##  [5] "6264509100/6264509100_R01C01_Red.idat"
##  [6] "6264509100/6264509100_R01C02_Grn.idat"
##  [7] "6264509100/6264509100_R01C02_Red.idat"
##  [8] "6264509100/6264509100_R02C01_Grn.idat"
##  [9] "6264509100/6264509100_R02C01_Red.idat"
## [10] "6264509100/6264509100_R02C02_Grn.idat"
## [11] "6264509100/6264509100_R02C02_Red.idat"
## [12] "6264509100/6264509100_R03C01_Grn.idat"
## [13] "6264509100/6264509100_R03C01_Red.idat"
## [14] "6264509100/6264509100_R03C02_Grn.idat"
## [15] "6264509100/6264509100_R03C02_Red.idat"
## [16] "6264509100/6264509100_R04C01_Grn.idat"
## [17] "6264509100/6264509100_R04C01_Red.idat"
## [18] "6264509100/6264509100_R04C02_Grn.idat"
## [19] "6264509100/6264509100_R04C02_Red.idat"
## [20] "6264509100/6264509100_R05C01_Grn.idat"
## [21] "6264509100/6264509100_R05C01_Red.idat"
## [22] "6264509100/6264509100_R05C02_Grn.idat"
## [23] "6264509100/6264509100_R05C02_Red.idat"
## [24] "6264509100/6264509100_R06C01_Grn.idat"
## [25] "6264509100/6264509100_R06C01_Red.idat"
## [26] "6264509100/6264509100_R06C02_Grn.idat"
## [27] "6264509100/6264509100_R06C02_Red.idat"
## [28] "ageData.RData"
## [29] "human_c2_v5.rdata"
## [30] "model-based-cpg-islands-hg19-chr17.txt"
## [31] "SampleSheet.csv"
## [32] "wgEncodeRegDnaseClusteredV3chr17.bed"</styled-content>
                    </preformat>
                </p>
                <p>To demonstrate the various aspects of analysing methylation data, we will be using a small, publicly available 450k methylation dataset (GSE49667) (
                    <xref ref-type="bibr" rid="ref-43">Zhang 
                        <italic toggle="yes">et al.</italic>, 2013</xref>). The dataset contains 10 samples in total: there are 4 different sorted T-cell types (naive, rTreg, act_naive, act_rTreg), collected from 3 different individuals (M28, M29, M30). For details describing sample collection and preparation, see 
                    <xref ref-type="bibr" rid="ref-43">Zhang 
                        <italic toggle="yes">et al.</italic> (2013)</xref>. An additional 
                    <monospace>birth</monospace> sample (individual VICS-72098-18-B) is included from another study (GSE51180) (
                    <xref ref-type="bibr" rid="ref-8">Cruickshank 
                        <italic toggle="yes">et al.</italic>, 2013</xref>) to illustrate approaches for identifying and excluding poor quality samples.</p>
                <p>There are several R Bioconductor packages available that have been developed for analysing methylation array data, including 
                    <italic toggle="yes">minfi</italic> (
                    <xref ref-type="bibr" rid="ref-1">Aryee 
                        <italic toggle="yes">et al.</italic>, 2014</xref>), 
                    <italic toggle="yes">missMethyl</italic> (
                    <xref ref-type="bibr" rid="ref-28">Phipson 
                        <italic toggle="yes">et al.</italic>, 2016</xref>), 
                    <italic toggle="yes">wateRmelon</italic> (
                    <xref ref-type="bibr" rid="ref-30">Pidsley 
                        <italic toggle="yes">et al.</italic>, 2013</xref>), 
                    <italic toggle="yes">methylumi</italic> (
                    <xref ref-type="bibr" rid="ref-9">Davis 
                        <italic toggle="yes">et al.</italic>, 2015</xref>), 
                    <italic toggle="yes">ChAMP</italic> (
                    <xref ref-type="bibr" rid="ref-26">Morris 
                        <italic toggle="yes">et al.</italic>, 2014</xref>) and 
                    <italic toggle="yes">charm</italic> (
                    <xref ref-type="bibr" rid="ref-2">Aryee 
                        <italic toggle="yes">et al.</italic>, 2011</xref>). Some of the packages, such as 
                    <italic toggle="yes">minfi</italic> and 
                    <italic toggle="yes">methylumi</italic>, include a framework for reading in the raw data from IDAT files and various specialised objects for storing and manipulating the data throughout the course of an analysis. Other packages provide specialised analysis methods for normalisation and statistical testing that rely on either 
                    <italic toggle="yes">minfi</italic> or 
                    <italic toggle="yes">methylumi</italic> objects. It is possible to convert between 
                    <italic toggle="yes">minfi</italic> and 
                    <italic toggle="yes">methylumi</italic> data types; however, this is not always trivial. Thus, it is advisable to consider the methods that you are interested in using and the data types that are most appropriate before you begin your analysis. Another popular method for analysing methylation array data is 
                    <italic toggle="yes">limma</italic> (
                    <xref ref-type="bibr" rid="ref-32">Ritchie 
                        <italic toggle="yes">et al.</italic>, 2015</xref>), which was originally developed for gene expression microarray analysis. As 
                    <italic toggle="yes">limma</italic> operates on a matrix of values, it is easily applied to any data that can be converted to a 
                    <monospace>matrix</monospace> in R. For a complete list of Bioconductor packages for analysing DNA methylation data, one can search for &#x201c;DNAMethylation&#x201d; in BiocViews (
                    <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/packages/release/BiocViews.html#___DNAMethylation">https://www.bioconductor.org/packages/release/BiocViews.html#___DNAMethylation</ext-link>) on the Bioconductor website.</p>
                <p>We will begin with an example of a 
                    <bold>probe-wise</bold> differential methylation analysis using 
                    <italic toggle="yes">minfi</italic> and 
                    <italic toggle="yes">limma</italic>. By 
                    <bold>probe-wise</bold> analysis we mean that each individual CpG probe is tested for differential methylation for the comparisons of interest, and that p-values and moderated t-statistics (
                    <xref ref-type="bibr" rid="ref-34">Smyth, 2004</xref>) will be generated for each CpG probe.</p>
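                <p>To make the idea of a probe-wise test concrete, the following is a minimal sketch on simulated data. The matrix of M-values, the two-group design and the size of the simulated shift are all assumptions made for illustration; they are not the design used for the T-cell dataset in this workflow.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
library(limma)

set.seed(1)

# simulated M-values (illustrative only): 100 probes measured in 6 samples
nProbes = 100
simM = matrix(rnorm(nProbes * 6), nrow = nProbes,
              dimnames = list(paste0("probe", 1:nProbes), paste0("s", 1:6)))

# hypothetical two-group design with three samples per group
group = factor(rep(c("A", "B"), each = 3))
design = model.matrix(~ group)

# shift the first five probes in group B so there is something to detect
simM[1:5, group == "B"] = simM[1:5, group == "B"] + 4

# fit a linear model to every probe, then moderate the variances
fit = eBayes(lmFit(simM, design))

# one row per probe: moderated t-statistic, p-value and adjusted p-value
res = topTable(fit, coef = "groupB", number = Inf)
head(res[, c("logFC", "t", "P.Value", "adj.P.Val")])
                    </preformat>
                </p>
                <p>Each row of the resulting table corresponds to a single probe, which is what distinguishes a probe-wise analysis from the region-level approaches covered later.</p>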
            </sec>
            <sec>
                <title>Loading the data</title>
                <p>It is useful to begin an analysis in R by loading all the packages that are likely to be required.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># load packages required for analysis</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(limma)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(minfi)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(IlluminaHumanMethylation450kanno.ilmn12.hg19)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(IlluminaHumanMethylation450kmanifest)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(RColorBrewer)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(missMethyl)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(matrixStats)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(minfiData)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(Gviz)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(DMRcate)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(stringr)</styled-content>
                    </preformat>
                </p>
                <p>The 
                    <italic toggle="yes">minfi</italic>, 
                    <italic toggle="yes">IlluminaHumanMethylation450kanno.ilmn12.hg19</italic>, 
                    <italic toggle="yes">IlluminaHumanMethylation450kmanifest</italic>, 
                    <italic toggle="yes">missMethyl</italic>, 
                    <italic toggle="yes">minfiData</italic> and 
                    <italic toggle="yes">DMRcate</italic> packages are methylation specific, while 
                    <italic toggle="yes">RColorBrewer</italic> and 
                    <italic toggle="yes">Gviz</italic> are visualisation packages. We use 
                    <italic toggle="yes">limma</italic> for testing differential methylation, and 
                    <italic toggle="yes">matrixStats</italic> and 
                    <italic toggle="yes">stringr</italic> have functions used in the workflow. The 
                    <italic toggle="yes">IlluminaHumanMethylation450kmanifest</italic> package provides the Illumina manifest as an R object that can easily be loaded into the environment. The manifest, together with the accompanying <italic toggle="yes">IlluminaHumanMethylation450kanno.ilmn12.hg19</italic> annotation package, supplies the annotation information for each of the CpG probes on the 450k array. This is useful for determining where any differentially methylated probes are located in a genomic context.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># get the 450k annotation data</styled-content>

                        <styled-content style="font-size:15px;">ann450k &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">getAnnotation</styled-content>
                        <styled-content style="font-size:15px;">(IlluminaHumanMethylation450kanno.ilmn12.hg19)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>(ann450k)</preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## DataFrame with 6 rows and 33 columns</styled-content>

                        <styled-content style="font-size:15px;">##		      chr	pos	 strand	       Name    AddressA</styled-content>

                        <styled-content style="font-size:15px;">##	      &lt;character&gt; &lt;integer&gt; &lt;character&gt; &lt;character&gt; &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873	     chrY   9363356	      -	 cg00050873    32735311</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031	     chrY  21239348	      -	 cg00212031    29674443</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748	     chrY   8148233	      -	 cg00213748    30703409</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611	     chrY  15815688	      -  cg00214611    69792329</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876	     chrY   9385539	      -  cg00455876    27653438</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559	     chrY   6778695	      +  cg01707559    45652402</styled-content>

                        <styled-content style="font-size:15px;">##	         AddressB	                                   ProbeSeqA</styled-content>

                        <styled-content style="font-size:15px;">##	      &lt;character&gt;	                                 &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873	 31717405 ACAAAAAAACAACACACAACTATAATAATTTTTAAAATAAATAAACCCCA</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031	 38703326 CCCAATTAACCACAAAAACTAAACAAATTATACAATCAAAAAAACATACA</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748	 36767301 TTTTAACACCTAACACCATTTTAACAATAAAAATTCTACAAAAAAAAACA</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611	 46723459 CTAACTTCCAAACCACACTTTATATACTAAACTACAATATAACACAAACA</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876	 69732350 AACTCTAAACTACCCAACACAAACTCCAAAAACTTCTCAAAAAAAACTCA</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559	 64689504 ACAAATTAAAAACACTAAAACAAACACAACAACTACAACAACAAAAAACA</styled-content>

                        <styled-content style="font-size:15px;">##	                                               ProbeSeqB	Type</styled-content>

                        <styled-content style="font-size:15px;">##	                                             &lt;character&gt; &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873 ACGAAAAAACAACGCACAACTATAATAATTTTTAAAATAAATAAACCCCG           I</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031 CCCAATTAACCGCAAAAACTAAACAAATTATACGATCGAAAAAACGTACG           I</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748 TTTTAACGCCTAACACCGTTTTAACGATAAAAATTCTACAAAAAAAAACG           I</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611 CTAACTTCCGAACCGCGCTTTATATACTAAACTACAATATAACGCGAACG           I</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876 AACTCTAAACTACCCGACACAAACTCCAAAAACTTCTCGAAAAAAACTCG           I</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559 GCGAATTAAAAACACTAAAACGAACGCGACGACTACAACGACAAAAAACG           I</styled-content>

                        <styled-content style="font-size:15px;">##	         NextBase	Color	 Probe_rs Probe_maf	 CpG_rs</styled-content>

                        <styled-content style="font-size:15px;">##	      &lt;character&gt; &lt;character&gt; &lt;character&gt; &lt;numeric&gt; &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873	        A	  Red	       NA	 NA	     NA</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031	        T	  Red	       NA	 NA	     NA</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748	        A	  Red	       NA	 NA	     NA</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611	        A	  Red	       NA	 NA	     NA</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876	        A	  Red	       NA	 NA	     NA</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559	        A	  Red	       NA	 NA	     NA</styled-content>

                        <styled-content style="font-size:15px;">##	        CpG_maf	     SBE_rs   SBE_maf           Islands_Name</styled-content>

                        <styled-content style="font-size:15px;">##	      &lt;numeric&gt; &lt;character&gt; &lt;numeric&gt;	         &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873	     NA	         NA	   NA   chrY:9363680-9363943</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031	     NA	         NA	   NA chrY:21238448-21240005</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748	     NA	         NA	   NA   chrY:8147877-8148210</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611	     NA	         NA	   NA chrY:15815488-15815779</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876	     NA	         NA	   NA   chrY:9385471-9385777</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559	     NA	         NA	   NA   chrY:6778574-6780028</styled-content>

                        <styled-content style="font-size:15px;">##	      Relation_to_Island</styled-content>

                        <styled-content style="font-size:15px;">##	             &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873	         N_Shore</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031	          Island</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748	         S_Shore</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611	          Island</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876	          Island</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559	          Island</styled-content>

                        <styled-content style="font-size:15px;">##</styled-content>

                        <styled-content style="font-size:15px;">##	      Forward_Sequence</styled-content>

                        <styled-content style="font-size:15px;">##	           &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873 TATCTCTGTCTGGCGAGGAGGCAACGCACAACTGTGGTGGTTTTTGGAGTGGGTGGACCC[CG]</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031 CCATTGGCCCGCCCCAGTTGGCCGCAGGGACTGAGCAAGTTATGCGGTCGGGAAGACGTG[CG]</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748 TCTGTGGGACCATTTTAACGCCTGGCACCGTTTTAACGATGGAGGTTCTGCAGGAGGGGG[CG]</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611 GCGCCGGCAGGACTAGCTTCCGGGCCGCGCTTTGTGTGCTGGGCTGCAGTGTGGCGCGGG[CG]</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876 CGCGTGTGCCTGGACTCTGAGCTACCCGGCACAAGCTCCAAGGGCTTCTCGGAGGAGGCT[CG]</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559 AGCGGCCGCTCCCAGTGGTGGTCACCGCCAGTGCCAATCCCTTGCGCCGCCGTGCAGTCC[CG]</styled-content>

                        <styled-content style="font-size:15px;">##	                                               SourceSeq Random_Loci</styled-content>

                        <styled-content style="font-size:15px;">##	                                             &lt;character&gt; &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873 CGGGGTCCACCCACTCCAAAAACCACCACAGTTGTGCGTTGCCTCCTCGC</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031 CGCACGTCTTCCCGACCGCATAACTTGCTCAGTCCCTGCGGCCAACTGGG</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748 CGCCCCCTCCTGCAGAACCTCCATCGTTAAAACGGTGCCAGGCGTTAAAA</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611 CGCCCGCGCCACACTGCAGCCCAGCACACAAAGCGCGGCCCGGAAGCTAG</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876 GACTCTGAGCTACCCGGCACAAGCTCCAAGGGCTTCTCGGAGGAGGCTCG</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559 CGCCCTCTGTCGCTGCAGCCGCCGCGCCCGCTCCAGTGCCCCCAATTCGC</styled-content>

                        <styled-content style="font-size:15px;">##	      Methyl27_Loci UCSC_RefGene_Name	     UCSC_RefGene_Accession</styled-content>

                        <styled-content style="font-size:15px;">##	        &lt;character&gt;       &lt;character&gt;	                &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873	               TSPY4;FAM197Y2	     NM_001164471;NR_001553</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031	                       TTTY14	                  NR_001543</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611	                TMSB4Y;TMSB4Y	        NM_004202;NM_004202</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559	            TBL1Y;TBL1Y;TBL1Y NM_134259;NM_033284;NM_134258</styled-content>

                        <styled-content style="font-size:15px;">##	        UCSC_RefGene_Group     Phantom	       DMR    Enhancer</styled-content>

                        <styled-content style="font-size:15px;">##	               &lt;character&gt; &lt;character&gt; &lt;character&gt; &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873         Body;TSS1500</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031	            TSS200</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611	     1stExon;5'UTR</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559 TSS200;TSS200;TSS200</styled-content>

                        <styled-content style="font-size:15px;">##	               HMM_Island Regulatory_Feature_Name</styled-content>

                        <styled-content style="font-size:15px;">##                    &lt;character&gt;	      &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873   Y:9973136-9976273</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031 Y:19697854-19699393</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748   Y:8207555-8208234</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611 Y:14324883-14325218     Y:15815422-15815706</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876   Y:9993394-9995882</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559   Y:6838022-6839951</styled-content>

                        <styled-content style="font-size:15px;">##	                    Regulatory_Feature_Group	     DHS</styled-content>

                        <styled-content style="font-size:15px;">##	                                 &lt;character&gt; &lt;character&gt;</styled-content>

                        <styled-content style="font-size:15px;">## cg00050873</styled-content>

                        <styled-content style="font-size:15px;">## cg00212031</styled-content>

                        <styled-content style="font-size:15px;">## cg00213748</styled-content>

                        <styled-content style="font-size:15px;">## cg00214611 Promoter_Associated_Cell_type_specific</styled-content>

                        <styled-content style="font-size:15px;">## cg00455876</styled-content>

                        <styled-content style="font-size:15px;">## cg01707559</styled-content>
                    </preformat>
                </p>
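                <p>To illustrate how the annotation places probes in a genomic context, the following minimal sketch looks up the location and CpG island relationship of two probes. It is self-contained: <monospace>annToy</monospace> is a toy <monospace>data.frame</monospace> built from a few columns of the six rows printed above, and <monospace>hits</monospace> is a hypothetical set of probes of interest. The same row and column subsetting by name should also apply to the full <monospace>ann450k</monospace> object.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># toy annotation: a few columns of the six rows printed above
annToy &lt;- data.frame(
  chr = rep("chrY", 6),
  pos = c(9363356L, 21239348L, 8148233L, 15815688L, 9385539L, 6778695L),
  Relation_to_Island = c("N_Shore", "Island", "S_Shore", "Island", "Island", "Island"),
  UCSC_RefGene_Name = c("TSPY4;FAM197Y2", "TTTY14", "", "TMSB4Y;TMSB4Y", "", "TBL1Y;TBL1Y;TBL1Y"),
  row.names = c("cg00050873", "cg00212031", "cg00213748", "cg00214611", "cg00455876", "cg01707559"),
  stringsAsFactors = FALSE
)

# hypothetical probes of interest, e.g. from a differential methylation analysis
hits &lt;- c("cg00212031", "cg00213748")

# genomic context of the selected probes
hitContext &lt;- annToy[hits, c("chr", "pos", "Relation_to_Island", "UCSC_RefGene_Name")]
hitContext

# number of probes in each CpG island category
islandCounts &lt;- table(annToy$Relation_to_Island)
islandCounts</styled-content>
                    </preformat>
                </p>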
                <p>As with Illumina's many other BeadArray platforms, methylation data is usually obtained in the form of Intensity Data (IDAT) files. This proprietary format is output by the scanner and stores summary intensities for each probe on the array. However, several Bioconductor packages facilitate the import of data from IDAT files into R (
                    <xref ref-type="bibr" rid="ref-33">Smith 
                        <italic toggle="yes">et al.</italic>, 2013</xref>). Typically, each IDAT file is approximately 8 MB in size. The simplest way to import the raw methylation data into R is to use the 
                    <italic toggle="yes">minfi</italic> function 
                    <monospace>read.metharray.sheet</monospace>, along with the path to the IDAT files and a sample sheet. The sample sheet is a comma-separated values (CSV) file containing one line per sample, with a number of columns describing each sample. The format expected by the 
                    <monospace>read.metharray.sheet</monospace> function is based on the sample sheet file that usually accompanies Illumina methylation array data. It is also very similar to the targets file described by the 
                    <italic toggle="yes">limma</italic> package. Importing the sample sheet into R creates a 
                    <monospace>data.frame</monospace> with one row for each sample and several columns. The 
                    <monospace>read.metharray.sheet</monospace> function uses the specified path and other information from the sample sheet to create a column called 
                    <monospace>Basename</monospace> which specifies the location of each individual IDAT file in the experiment.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># read in the sample sheet for the experiment</styled-content>

                        <styled-content style="font-size:15px;">targets &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">read.metharray.sheet</styled-content>
                        <styled-content style="font-size:15px;">(dataDirectory,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">pattern=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"SampleSheet.csv"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## [read.metharray.sheet] Found the following CSV files:</styled-content>


                        <styled-content style="font-size:15px;">## [1] "./data/SampleSheet.csv"</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">targets</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">##    Sample_Name  Sample_Well   Sample_Source Sample_Group Sample_Label</styled-content>

                        <styled-content style="font-size:15px;">## 1            1           A1             M28        naive        naive</styled-content>

                        <styled-content style="font-size:15px;">## 2            2           B1             M28        rTreg        rTreg</styled-content>

                        <styled-content style="font-size:15px;">## 3            3           C1             M28    act_naive    act_naive</styled-content>

                        <styled-content style="font-size:15px;">## 4            4           D1             M29        naive        naive</styled-content>

                        <styled-content style="font-size:15px;">## 5            5           E1             M29    act_naive    act_naive</styled-content>

                        <styled-content style="font-size:15px;">## 6            6           F1             M29    act_rTreg    act_rTreg</styled-content>

                        <styled-content style="font-size:15px;">## 7            7           G1             M30        naive        naive</styled-content>

                        <styled-content style="font-size:15px;">## 8            8           H1             M30        rTreg        rTreg</styled-content>

                        <styled-content style="font-size:15px;">## 9            9           A2             M30    act_naive    act_naive</styled-content>

                        <styled-content style="font-size:15px;">## 10          10           B2             M30    act_rTreg    act_rTreg</styled-content>

                        <styled-content style="font-size:15px;">## 11          11          H06 VICS-72098-18-B        birth        birth</styled-content>

                        <styled-content style="font-size:15px;">##      Pool_ID   Array      Slide                            Basename</styled-content>

                        <styled-content style="font-size:15px;">## 1 	   &lt;NA&gt;  R01C01 6264509100 ./data/6264509100/6264509100_R01C01</styled-content>

                        <styled-content style="font-size:15px;">## 2 	   &lt;NA&gt;  R02C01 6264509100 ./data/6264509100/6264509100_R02C01</styled-content>

                        <styled-content style="font-size:15px;">## 3 	   &lt;NA&gt;  R03C01 6264509100 ./data/6264509100/6264509100_R03C01</styled-content>

                        <styled-content style="font-size:15px;">## 4 	   &lt;NA&gt;  R04C01 6264509100 ./data/6264509100/6264509100_R04C01</styled-content>

                        <styled-content style="font-size:15px;">## 5 	   &lt;NA&gt;  R05C01 6264509100 ./data/6264509100/6264509100_R05C01</styled-content>

                        <styled-content style="font-size:15px;">## 6 	   &lt;NA&gt;  R06C01 6264509100 ./data/6264509100/6264509100_R06C01</styled-content>

                        <styled-content style="font-size:15px;">## 7 	   &lt;NA&gt;  R01C02 6264509100 ./data/6264509100/6264509100_R01C02</styled-content>

                        <styled-content style="font-size:15px;">## 8 	   &lt;NA&gt;  R02C02 6264509100 ./data/6264509100/6264509100_R02C02</styled-content>

                        <styled-content style="font-size:15px;">## 9 	   &lt;NA&gt;  R03C02 6264509100 ./data/6264509100/6264509100_R03C02</styled-content>

                        <styled-content style="font-size:15px;">## 10 	   &lt;NA&gt;  R04C02 6264509100 ./data/6264509100/6264509100_R04C02</styled-content>

                        <styled-content style="font-size:15px;">## 11 	   &lt;NA&gt;  R06C02 5975827018 ./data/5975827018/5975827018_R06C02</styled-content>
                    </preformat>
                </p>
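                <p>The <monospace>Basename</monospace> values printed above follow the pattern <monospace>path/Slide/Slide_Array</monospace>. The following minimal sketch reproduces that pattern for two of the arrays using base R only. It illustrates the naming convention and is not the internal implementation of <monospace>read.metharray.sheet</monospace>; the names <monospace>sheet</monospace>, <monospace>baseDir</monospace> and <monospace>idatFiles</monospace> are hypothetical, and the <monospace>_Grn.idat</monospace> and <monospace>_Red.idat</monospace> suffixes are assumed to be the file names of the green and red channel files for each array.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># Slide and Array values for two of the samples in the sample sheet above
sheet &lt;- data.frame(
  Slide = c("6264509100", "5975827018"),
  Array = c("R01C01", "R06C02"),
  stringsAsFactors = FALSE
)

# directory that holds one sub-directory per slide (assumed layout)
baseDir &lt;- "./data"

# compose the Basename as path/Slide/Slide_Array
sheet$Basename &lt;- file.path(baseDir, sheet$Slide,
                            paste(sheet$Slide, sheet$Array, sep = "_"))
sheet$Basename

# each array is expected to have one green and one red channel IDAT file
idatFiles &lt;- paste0(rep(sheet$Basename, each = 2), c("_Grn.idat", "_Red.idat"))
idatFiles</styled-content>
                    </preformat>
                </p>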
                <p>Now that we have imported the information about the samples and where the data is located, we can read the raw intensity signals into R from the IDAT files using the 
                    <monospace>read.metharray.exp</monospace> function. This creates an 
                    <monospace>RGChannelSet</monospace> object that contains all the raw intensity data, from both the red and green colour channels, for each of the samples. At this stage, it can be useful to rename the samples with more descriptive names.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903"># read in the raw data from the IDAT files</styled-content>

                        <styled-content style="font-size:15px;">rgSet &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">read.metharray.exp</styled-content>(
                        <styled-content style="font-size:15px;color:#214A87">targets=</styled-content>
                        <styled-content style="font-size:15px;">targets)
rgSet</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## RGChannelSet (storageMode: lockedEnvironment)
## assayData: 622399 features, 11 samples
##   element names: Green, Red
## An object of class 'AnnotatedDataFrame'
##   sampleNames: 6264509100_R01C01 6264509100_R02C01 ...
##     5975827018_R06C02 (11 total)
##   varLabels: Sample_Name Sample_Well ... filenames (10 total)
##   varMetadata: labelDescription
## Annotation
##   array: IlluminaHumanMethylation450k
##   annotation: ilmn12.hg19</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;color:#8F5903;"># give the samples descriptive names</styled-content>

                        <styled-content style="font-size:15px;">targets$ID &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">paste</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group,targets$Sample_Name,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"."</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">sampleNames</styled-content>
                        <styled-content style="font-size:15px;">(rgSet) &lt;- targets$ID</styled-content>

                        <styled-content style="font-size:15px;">rgSet</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;">## RGChannelSet (storageMode: lockedEnvironment)
## assayData: 622399 features, 11 samples
##   element names: Green, Red
## An object of class 'AnnotatedDataFrame'
##   sampleNames: naive.1 rTreg.2 ... birth.11 (11 total)
##   varLabels: Sample_Name Sample_Well ... filenames (10 total)
##   varMetadata: labelDescription
## Annotation
##   array: IlluminaHumanMethylation450k
##   annotation: ilmn12.hg19</styled-content>
                    </preformat>
                </p>
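                <p>The sample number is appended to the group label because the group labels alone are not unique. The short base R sketch below checks this using the group labels and sample numbers listed in the sample sheet above; the vector names <monospace>sampleGroup</monospace>, <monospace>sampleName</monospace> and <monospace>id</monospace> are hypothetical stand-ins for the corresponding columns of <monospace>targets</monospace>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># sample groups and names as listed in the sample sheet above
sampleGroup &lt;- c("naive", "rTreg", "act_naive", "naive", "act_naive", "act_rTreg",
                 "naive", "rTreg", "act_naive", "act_rTreg", "birth")
sampleName &lt;- 1:11

# group labels repeat across donors, so they cannot serve as sample names
anyDuplicated(sampleGroup) &gt; 0

# appending the sample number gives one distinct name per array
id &lt;- paste(sampleGroup, sampleName, sep = ".")
stopifnot(anyDuplicated(id) == 0)
id</styled-content>
                    </preformat>
                </p>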
            </sec>
            <sec>
                <title>Quality control</title>
                <p>Once the data has been imported into R, we can evaluate its quality. First, we calculate detection p-values: one for every CpG in every sample, each indicating the quality of the signal. The method used by 
                    <italic toggle="yes">minfi</italic> to calculate detection p-values compares the total signal (
                    <italic toggle="yes">M</italic> + 
                    <italic toggle="yes">U</italic>) for each probe to the background signal level, which is estimated from the negative control probes. Very small p-values indicate a reliable signal, whilst large p-values, for example &gt;0.01, generally indicate a poor-quality signal.</p>
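                <p>The idea of comparing the total signal to the background can be sketched in base R as follows. This is a simplified illustration on simulated values and does not reproduce the exact handling of probe types and colour channels in <italic toggle="yes">minfi</italic>. It assumes that the total signal of a probe without true signal behaves like the sum of two background measurements and can be modelled with a normal distribution whose centre and spread are estimated robustly from the negative controls. All object names and intensities are hypothetical.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># simulated negative control intensities (toy values, not real data)
set.seed(1)
negControls &lt;- rnorm(600, mean = 200, sd = 30)

# robust estimates of the background level and its spread
bgMu &lt;- median(negControls)
bgSd &lt;- mad(negControls)

# hypothetical methylated (M) and unmethylated (U) intensities for three probes
M &lt;- c(5000, 150, 900)
U &lt;- c(300, 180, 1200)
total &lt;- M + U

# probability of observing a total signal at least this large from background alone
# (assumption: background for M + U is normal with mean 2 * bgMu and sd 2 * bgSd)
pToy &lt;- pnorm(total, mean = 2 * bgMu, sd = 2 * bgSd, lower.tail = FALSE)

# probes whose signal cannot be distinguished from background at the 0.01 cut-off
failed &lt;- pToy &gt; 0.01
failed</styled-content>
                    </preformat>
                </p>
                <p>In this toy example, only the second probe, whose total intensity is close to the background level, is flagged as unreliable.</p>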
                <p>Plotting the mean detection p-value for each sample allows us to gauge the general quality of the samples in terms of the overall signal reliability (
                    <xref ref-type="fig" rid="f2">Figure 2</xref>). Samples that have many failed probes will have relatively large mean detection p-values.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;color:#8F5903;"># calculate the detection p-values</styled-content>

                        <styled-content style="font-size:15px;">detP &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">detectionP</styled-content>
                        <styled-content style="font-size:15px;">(rgSet)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;">(detP)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;">
##            naive.1 rTreg.2  act_naive.3 naive.4 act_naive.5 act_rTreg.6
## cg00050873       0       0 0.000000e+00       0 0.00000e+00           0
## cg00212031       0       0 0.000000e+00       0 0.00000e+00           0
## cg00213748       0       0 1.181832e-12       0 8.21565e-15           0
## cg00214611       0       0 0.000000e+00       0 0.00000e+00           0
## cg00455876       0       0 0.000000e+00       0 0.00000e+00           0
## cg01707559       0       0 0.000000e+00       0 0.00000e+00           0
##            naive.7      rTreg.8 act_naive.9 act_rTreg.10  birth.11
## cg00050873       0 0.000000e+00           0 0.000000e+00 0.0000000
## cg00212031       0 0.000000e+00           0 0.000000e+00 0.0000000
## cg00213748       0 1.469801e-05           0 1.365951e-08 0.6735224
## cg00214611       0 0.000000e+00           0 0.000000e+00 0.7344451
## cg00455876       0 0.000000e+00           0 0.000000e+00 0.0000000
## cg01707559       0 0.000000e+00           0 0.000000e+00 0.0000000</styled-content>
                    </preformat>
                </p>
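                <p>Per-sample summaries of the detection p-values make failed samples easy to spot. The sketch below computes the mean detection p-value and the number of probes above the 0.01 cut-off for three of the samples, using only the six probes printed above. <monospace>detToy</monospace> is a hypothetical toy matrix, so the resulting numbers are illustrative only; with the real data, the same calls could be applied to <monospace>detP</monospace>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># toy matrix: six probes and three samples copied from the output above
detToy &lt;- cbind(
  naive.1  = c(0, 0, 0, 0, 0, 0),
  rTreg.8  = c(0, 0, 1.469801e-05, 0, 0, 0),
  birth.11 = c(0, 0, 0.6735224, 0.7344451, 0, 0)
)
rownames(detToy) &lt;- c("cg00050873", "cg00212031", "cg00213748",
                      "cg00214611", "cg00455876", "cg01707559")

# mean detection p-value per sample
meanDetP &lt;- colMeans(detToy)
meanDetP

# number of probes per sample with a detection p-value above 0.01
failedPerSample &lt;- colSums(detToy &gt; 0.01)
failedPerSample</styled-content>
                    </preformat>
                </p>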
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;color:#8F5903;"># examine mean detection p-values across all samples to identify any failed samples</styled-content>

                        <styled-content style="font-size:15px;">pal &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">brewer.pal</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">8</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Dark2"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">barplot</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">colMeans</styled-content>
                        <styled-content style="font-size:15px;">(detP),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">las=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">cex.names=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.8</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">ylab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Mean detection p-values"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">abline</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">h=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.01</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"red"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"topleft"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">fill=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>


                        <styled-content style="font-size:15px;color:#214A87;">barplot</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">colMeans</styled-content>
                        <styled-content style="font-size:15px;">(detP),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">las=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">cex.names=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.8</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">ylim = c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.002</styled-content>
                        <styled-content style="font-size:15px;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">ylab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Mean detection p-values"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"topleft"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">fill=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>Figure 2. </label>
                    <caption>
                        <title>Mean detection p-values summarise the quality of the signal across all the probes in each sample.</title>
                        <p>The plot on the right is a zoomed-in version of the plot on the left.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure2.gif"/>
                </fig>
                <p>The 
                    <italic toggle="yes">minfi</italic> 
                    <monospace>qcReport</monospace> function generates many other useful quality control plots. The 
                    <italic toggle="yes">minfi</italic> vignette describes the various plots and how they should be interpreted in detail. Generally, samples that look poor based on mean detection p-value will also look poor using other metrics and it is usually advisable to exclude them from further analysis.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">qcReport</styled-content>
                        <styled-content style="font-size:15px;">(rgSet,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sampNames=</styled-content>
                        <styled-content style="font-size:15px;">targets$ID,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sampGroups=</styled-content>
                        <styled-content style="font-size:15px;">targets$Sample_Group,</styled-content>
          
                        <styled-content style="font-size:15px;color:#214A87;">pdf=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"qcReport.pdf"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <p>Poor quality samples can be excluded from the analysis by applying a cutoff to the mean detection p-value, for example removing samples with a mean above 0.05. For this particular dataset, the 
                    <monospace>birth</monospace> sample shows a very high mean detection p-value (
                    <xref ref-type="fig" rid="f2">Figure 2</xref>), and hence it is excluded from subsequent analysis.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># remove poor quality samples</styled-content>

                        <styled-content style="font-size:15px;">keep &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">colMeans</styled-content>
                        <styled-content style="font-size:15px;">(detP) &lt;</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">0.05</styled-content>

                        <styled-content style="font-size:15px;">rgSet &lt;- rgSet[,keep]</styled-content>

                        <styled-content style="font-size:15px;">rgSet</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## RGChannelSet (storageMode: lockedEnvironment)
## assayData: 622399 features, 10 samples
##   element names: Green, Red
## An object of class 'AnnotatedDataFrame'
##   sampleNames: naive.1 rTreg.2 ... act_rTreg.10 (10 total)
##   varLabels: Sample_Name Sample_Well ... filenames (10 total)
##   varMetadata: labelDescription
## Annotation
##   array: IlluminaHumanMethylation450k
##   annotation: ilmn12.hg19</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># remove poor quality samples from targets data</styled-content>

                        <styled-content style="font-size:15px;">targets &lt;- targets[keep,]</styled-content>

                        <styled-content style="font-size:15px;">targets[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>:
                        <styled-content style="font-size:15px;color:#0000CF;">5</styled-content>
                        <styled-content style="font-size:15px;">]</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">
##    Sample_Name Sample_Well Sample_Source Sample_Group Sample_Label
## 1            1          A1           M28        naive        naive
## 2            2          B1           M28        rTreg        rTreg
## 3            3          C1           M28    act_naive    act_naive
## 4            4          D1           M29        naive        naive
## 5            5          E1           M29    act_naive    act_naive
## 6            6          F1           M29    act_rTreg    act_rTreg
## 7            7          G1           M30        naive        naive
## 8            8          H1           M30        rTreg        rTreg
## 9            9          A2           M30    act_naive    act_naive
## 10          10          B2           M30    act_rTreg    act_rTreg</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># remove poor quality samples from detection p-value table</styled-content>

                        <styled-content style="font-size:15px;">detP &lt;- detP[,keep]</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">dim</styled-content>
                        <styled-content style="font-size:15px;">(detP)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## [1] 485512  10</styled-content>
                    </preformat>
                </p>
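                <p>The three filtering steps above reuse one logical vector so that the intensity data, the sample sheet and the detection p-value matrix keep the same samples in the same order. The following is a minimal sketch of that pattern on made-up data; the objects <monospace>toyDetP</monospace>, <monospace>toyTargets</monospace> and <monospace>toyKeep</monospace>, and all of their values, are hypothetical and are not part of the dataset analysed here.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># toy detection p-values: rows are probes, columns are samples (hypothetical values)
toyDetP &lt;- matrix(c(0.001, 0.002, 0.000, 0.001,
                    0.200, 0.150, 0.300, 0.100,
                    0.000, 0.001, 0.003, 0.000),
                  nrow = 4,
                  dimnames = list(paste0("cg", 1:4), c("s1", "s2", "s3")))

# toy sample sheet with one row per sample, in the same order as the columns above
toyTargets &lt;- data.frame(ID = c("s1", "s2", "s3"),
                         Sample_Group = c("naive", "birth", "rTreg"),
                         stringsAsFactors = FALSE)

# keep samples whose mean detection p-value is below the cutoff
toyKeep &lt;- colMeans(toyDetP) &lt; 0.05

# apply the same logical vector to the matrix columns and the sample sheet rows
toyDetP &lt;- toyDetP[, toyKeep]
toyTargets &lt;- toyTargets[toyKeep, ]

# check that both objects still describe the same samples in the same order
stopifnot(identical(colnames(toyDetP), toyTargets$ID))</styled-content>
                    </preformat>
                </p>
                <p>If the final check fails, the objects have fallen out of step and group labels taken from the sample sheet would be assigned to the wrong samples in later plots and models.</p>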
            </sec>
            <sec>
                <title>Normalisation</title>
                <p>To minimise the unwanted variation within and between samples, various data normalisations can be applied. Many different types of normalisation have been developed for methylation arrays and it is beyond the scope of this workflow to compare and contrast all of them (
                    <xref ref-type="bibr" rid="ref-12">Fortin 
                        <italic toggle="yes">et al.</italic>, 2014</xref>; 
                    <xref ref-type="bibr" rid="ref-24">Maksimovic 
                        <italic toggle="yes">et al.</italic>, 2012</xref>; 
                    <xref ref-type="bibr" rid="ref-25">Mancuso 
                        <italic toggle="yes">et al.</italic>, 2011</xref>; 
                    <xref ref-type="bibr" rid="ref-30">Pidsley 
                        <italic toggle="yes">et al.</italic>, 2013</xref>; 
                    <xref ref-type="bibr" rid="ref-35">Sun 
                        <italic toggle="yes">et al.</italic>, 2011</xref>; 
                    <xref ref-type="bibr" rid="ref-36">Teschendorff 
                        <italic toggle="yes">et al.</italic>, 2013</xref>; 
                    <xref ref-type="bibr" rid="ref-38">Touleimat &amp; Tost, 2012</xref>; 
                    <xref ref-type="bibr" rid="ref-39">Triche 
                        <italic toggle="yes">et al.</italic>, 2013</xref>; 
                    <xref ref-type="bibr" rid="ref-40">Wang 
                        <italic toggle="yes">et al.</italic>, 2012</xref>; 
                    <xref ref-type="bibr" rid="ref-42">Wu 
                        <italic toggle="yes">et al.</italic>, 2014</xref>). Several methods have been built into 
                    <italic toggle="yes">minfi</italic> and can be directly applied within its framework (
                    <xref ref-type="bibr" rid="ref-12">Fortin 
                        <italic toggle="yes">et al.</italic>, 2014</xref>; 
                    <xref ref-type="bibr" rid="ref-24">Maksimovic 
                        <italic toggle="yes">et al.</italic>, 2012</xref>; 
                    <xref ref-type="bibr" rid="ref-39">Triche 
                        <italic toggle="yes">et al.</italic>, 2013</xref>; 
                    <xref ref-type="bibr" rid="ref-38">Touleimat &amp; Tost, 2012</xref>), whilst others are 
                    <italic toggle="yes">methylumi</italic>-specific or require custom data types (
                    <xref ref-type="bibr" rid="ref-25">Mancuso 
                        <italic toggle="yes">et al.</italic>, 2011</xref>; 
                    <xref ref-type="bibr" rid="ref-30">Pidsley 
                        <italic toggle="yes">et al.</italic>, 2013</xref>; 
                    <xref ref-type="bibr" rid="ref-35">Sun 
                        <italic toggle="yes">et al.</italic>, 2011</xref>; 
                    <xref ref-type="bibr" rid="ref-36">Teschendorff 
                        <italic toggle="yes">et al.</italic>, 2013</xref>; 
                    <xref ref-type="bibr" rid="ref-40">Wang 
                        <italic toggle="yes">et al.</italic>, 2012</xref>; 
                    <xref ref-type="bibr" rid="ref-42">Wu 
                        <italic toggle="yes">et al.</italic>, 2014</xref>). Although there is no single normalisation method that is universally considered best, a recent study by 
                    <xref ref-type="bibr" rid="ref-12">Fortin 
                        <italic toggle="yes">et al.</italic> (2014)</xref> has suggested that a good rule of thumb within the 
                    <italic toggle="yes">minfi</italic> framework is that the 
                    <monospace>preprocessFunnorm</monospace> (
                    <xref ref-type="bibr" rid="ref-12">Fortin 
                        <italic toggle="yes">et al.</italic>, 2014</xref>) function is most appropriate for datasets with global methylation differences such as cancer/normal or vastly different tissue types, whilst the 
                    <monospace>preprocessQuantile</monospace> function (
                    <xref ref-type="bibr" rid="ref-38">Touleimat &amp; Tost, 2012</xref>) is better suited to datasets where global differences between samples are not expected, for example a single tissue. Further discussion of the appropriate choice of normalisation can be found in 
                    <xref ref-type="bibr" rid="ref-15">Hicks &amp; Irizarry (2015)</xref>, and the accompanying 
                    <italic toggle="yes">quantro</italic> package includes data-driven tests for the assumptions of quantile normalisation. As we are comparing different blood cell types, which are globally relatively similar, we will apply the 
                    <monospace>preprocessQuantile</monospace> method to our data (
                    <xref ref-type="fig" rid="f3">Figure 3</xref>). This function implements a stratified quantile normalisation procedure which is applied to the methylated and unmethylated signal intensities separately, and takes into account the different probe types. Note that after normalisation, the data is housed in a 
                    <monospace>GenomicRatioSet</monospace> object. This is a much more compact representation of the data as the colour channel information has been discarded and the 
                    <italic toggle="yes">M</italic> and 
                    <italic toggle="yes">U</italic> intensity information has been converted to M-values and beta values, together with associated genomic coordinates. Note, running the 
                    <monospace>preprocessQuantile</monospace> function on this dataset produces the warning: &#x2018;An inconsistency was encountered while determining sex&#x2019;; this can be ignored as it is due to all the samples being from male donors.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;color:#8F5903;"># normalize the data; this results in a GenomicRatioSet object</styled-content>

                        <styled-content style="font-size:15px;">mSetSq &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">preprocessQuantile</styled-content>
                        <styled-content style="font-size:15px;">(rgSet)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;">
## [preprocessQuantile] Mapping to genome.

## [preprocessQuantile] Fixing outliers.

## Warning in .getSex(CN = CN, xIndex = xIndex, yIndex = yIndex, cutoff
## = cutoff): An inconsistency was encountered while determining sex. One
## possibility is that only one sex is present. We recommend further checks,
## for example with the plotSex function.

## [preprocessQuantile] Quantile normalizing.</styled-content>
                    </preformat>
                </p>
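                <p>The <monospace>preprocessQuantile</monospace> call above applies a stratified procedure that treats the methylated and unmethylated intensities separately and accounts for probe type. As a rough illustration of the underlying idea only, the sketch below applies plain, unstratified quantile normalisation to a made-up matrix using base R. It is not the <italic toggle="yes">minfi</italic> implementation, and <monospace>toyInt</monospace>, <monospace>toyRef</monospace> and <monospace>toyNorm</monospace> are hypothetical names.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># toy intensities: rows are probes, columns are samples (hypothetical values)
toyInt &lt;- matrix(c(5, 2, 3, 4,
                   4, 1, 6, 2,
                   3, 4, 6, 8),
                 nrow = 4, dimnames = list(NULL, c("s1", "s2", "s3")))

# reference distribution: the mean of the sorted values across samples
toyRef &lt;- rowMeans(apply(toyInt, 2, sort))

# replace each value with the reference value of the same rank within its sample
toyNorm &lt;- apply(toyInt, 2, function(v) toyRef[rank(v, ties.method = "first")])

# after normalisation, every sample contains the same set of values
apply(toyNorm, 2, sort)</styled-content>
                    </preformat>
                </p>
                <p>Forcing every sample onto one reference distribution is what makes the method unsuitable when genuine global differences between samples are expected.</p>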
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;color:#8F5903;"># create a MethylSet object from the raw data for plotting</styled-content>

                        <styled-content style="font-size:15px;">mSetRaw &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">preprocessRaw</styled-content>
                        <styled-content style="font-size:15px;">(rgSet)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;color:#8F5903;"># visualise what the data looks like before and after normalisation</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">densityPlot</styled-content>
                        <styled-content style="font-size:15px;">(rgSet,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sampGroups=</styled-content>
                        <styled-content style="font-size:15px;">targets$Sample_Group,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">main=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Raw"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"top"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend = levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">text.col=brewer.pal</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">8</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Dark2"</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">densityPlot</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getBeta</styled-content>
                        <styled-content style="font-size:15px;">(mSetSq),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sampGroups=</styled-content>
                        <styled-content style="font-size:15px;">targets$Sample_Group,</styled-content>
              
                        <styled-content style="font-size:15px;color:#214A87;">main=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Normalized"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"top"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend = levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">text.col=brewer.pal</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">8</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Dark2"</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>Figure 3. </label>
                    <caption>
                        <title>The density plots show the distribution of the beta values for each sample before and after normalisation.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure3.gif"/>
                </fig>
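                <p>The normalisation paragraph above notes that the methylated and unmethylated intensities are converted to beta values and M-values. The sketch below shows how the two measures are commonly computed and how they relate to each other. It assumes the usual definitions with no offset added to the denominator of the beta value, and the intensities and object names are hypothetical.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># toy methylated and unmethylated intensities for three probes (hypothetical values)
toyMeth   &lt;- c(100, 1000, 3000)
toyUnmeth &lt;- c(3000, 1000, 100)

# beta value: proportion of the total signal that is methylated, bounded by 0 and 1
toyBeta &lt;- toyMeth / (toyMeth + toyUnmeth)

# M-value: log2 ratio of methylated to unmethylated signal, unbounded
toyM &lt;- log2(toyMeth / toyUnmeth)

# the two scales are related by a logit (base 2) transformation
all.equal(toyM, log2(toyBeta / (1 - toyBeta)))</styled-content>
                    </preformat>
                </p>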
            </sec>
            <sec>
                <title>Data exploration</title>
                <p>Multi-dimensional scaling (MDS) plots are excellent for visualising data, and are usually among the first plots that should be made when exploring the data. MDS plots are based on principal components analysis and provide an unsupervised view of the similarities and differences between the samples. Samples that are more similar to each other should cluster together, and samples that are very different should lie further apart on the plot. Dimension one (or principal component one) captures the greatest source of variation in the data, dimension two captures the second greatest source, and so on. Colouring the data points or labels by known factors of interest can often highlight exactly what the greatest sources of variation are. MDS plots can also help to detect sample mix-ups.</p>
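                <p>The link between MDS and principal components analysis can be checked on a small example. The sketch below uses the base R functions <monospace>cmdscale</monospace> and <monospace>prcomp</monospace> on simulated data rather than the <monospace>plotMDS</monospace> function used in this workflow, so it illustrates the general relationship and not the exact calculation behind the plots that follow. The object names are hypothetical.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># toy data: 6 samples (rows) by 5 features (columns), simulated for illustration only
set.seed(1)
toyData &lt;- matrix(rnorm(30), nrow = 6,
                  dimnames = list(paste0("sample", 1:6), NULL))

# classical MDS on the Euclidean distances between samples
toyMds &lt;- cmdscale(dist(toyData), k = 2)

# principal component scores for the same samples (centred, not scaled)
toyPca &lt;- prcomp(toyData)$x[, 1:2]

# the two sets of coordinates agree up to the sign of each axis
all.equal(abs(unname(toyMds)), abs(unname(toyPca)))</styled-content>
                    </preformat>
                </p>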
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># MDS plots to look at largest sources of variation</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSq),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)])</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"top"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">text.col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.7</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>


                        <styled-content style="font-size:15px;color:#214A87;">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSq),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)])</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"top"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">text.col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.7</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <p>Examining the MDS plots for this dataset demonstrates that the largest source of variation is the difference between individuals (
                    <xref ref-type="fig" rid="f4">Figure 4</xref>). The higher dimensions reveal that the differences between cell types are largely captured by the third and fourth principal components (
                    <xref ref-type="fig" rid="f5">Figure 5</xref>). This information is useful because it can inform downstream analysis: if the MDS plots reveal obvious sources of unwanted variation, we can account for them by including them in our statistical model. For this dataset, we will include inter-individual variation in the model.</p>
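                <p>To make the idea behind these plots concrete, the following is a minimal sketch of classical multi-dimensional scaling on a simulated matrix of M-values, using only base R. It is not the algorithm implemented by plotMDS; the toy data, the object names (mToy, mTop, mdsToy, propVar) and the choice of Euclidean distance are assumptions made purely for illustration.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">set.seed(1)
# simulated M-values: 500 probes x 6 samples (hypothetical data, not the workflow dataset)
mToy &lt;- matrix(rnorm(500 * 6), nrow = 500,
               dimnames = list(paste0("cg", 1:500), paste0("s", 1:6)))
# mimic an "individual" effect shared by samples s1 and s2 in the first 50 probes
mToy[1:50, c("s1", "s2")] &lt;- mToy[1:50, c("s1", "s2")] + 3
# keep the 100 most variable probes, similar in spirit to the top argument used above
sds &lt;- apply(mToy, 1, sd)
mTop &lt;- mToy[order(sds, decreasing = TRUE)[1:100], ]
# classical MDS on Euclidean distances between samples, retaining four dimensions
mdsToy &lt;- cmdscale(dist(t(mTop)), k = 4, eig = TRUE)
# share of the total (positive) eigenvalue mass captured by each retained dimension
propVar &lt;- mdsToy$eig[1:4] / sum(mdsToy$eig[mdsToy$eig &gt; 0])
round(propVar, 2)
# plot dimensions 1 and 2; s1 and s2 would be expected to sit apart from the rest
plot(mdsToy$points[, 1:2], type = "n", xlab = "Dim 1", ylab = "Dim 2")
text(mdsToy$points[, 1:2], labels = colnames(mToy))</styled-content>
                    </preformat>
                </p>
                <p>Inspecting propVar alongside the plots gives a rough indication of how much variation each dimension accounts for, which can help when deciding which factors to include in the statistical model.</p>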
                <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                    <label>Figure 4. </label>
                    <caption>
                        <title>Multi-dimensional scaling plots are a good way to visualise the relationships between the samples in an experiment.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure4.gif"/>
                </fig>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># Examine higher dimensions to look at other sources of variation</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSq),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dim=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"top"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">text.col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.7</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>


                        <styled-content style="font-size:15px;color:#214A87;">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSq),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dim=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"topleft"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">text.col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.7</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>


                        <styled-content style="font-size:15px;color:#214A87;">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSq),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dim=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">4</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"topright"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">text.col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.7</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f5" orientation="portrait" position="float">
                    <label>Figure 5. </label>
                    <caption>
                        <title>Examining the higher dimensions of an MDS plot can reveal significant sources of variation in the data.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure5.gif"/>
                </fig>
            </sec>
            <sec>
                <title>Filtering</title>
                <p>Poorly performing probes are generally filtered out prior to differential methylation analysis. The signal from these probes is unreliable, and removing them means that we perform fewer statistical tests and thus incur a reduced multiple testing penalty. Here, we filter out probes that have failed in one or more samples based on detection p-value.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># ensure probes are in the same order in the mSetSq and detP objects</styled-content>

                        <styled-content style="font-size:15px;">detP &lt;- detP[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">match</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">featureNames</styled-content>
                        <styled-content style="font-size:15px;">(mSetSq),</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;">(detP)),]</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903;"># remove any probes that have failed in one or more samples</styled-content>

                        <styled-content style="font-size:15px;">keep &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">rowSums</styled-content>
                        <styled-content style="font-size:15px;">(detP &lt;</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">0.01</styled-content>
                        <styled-content style="font-size:15px;">) ==</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">ncol</styled-content>
                        <styled-content style="font-size:15px;">(mSetSq)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                        <styled-content style="font-size:15px;">(keep)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## keep
##  FALSE    TRUE
##    977  484535</styled-content>
                    </preformat>
                </p>
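                <p>The logic of this filter can be seen more easily on a small example. The sketch below applies the same rule to a toy matrix of detection p-values using base R only. The matrix values and the object names (detPToy, keepStrict, keepRelaxed, maxFailed) are hypothetical, and the relaxed rule that tolerates a single failed sample is an assumption shown for comparison; it is not used in this workflow.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># toy detection p-values: 5 probes x 3 samples (hypothetical values)
detPToy &lt;- matrix(c(0.001, 0.002, 0.000,
                    0.200, 0.001, 0.003,
                    0.004, 0.005, 0.006,
                    0.500, 0.600, 0.001,
                    0.000, 0.000, 0.009),
                  nrow = 5, byrow = TRUE,
                  dimnames = list(paste0("probe", 1:5), paste0("s", 1:3)))
# strict rule, as used above: a probe is kept only if it passes in every sample
keepStrict &lt;- rowSums(detPToy &lt; 0.01) == ncol(detPToy)
# relaxed alternative (an assumption for illustration): tolerate at most one failed sample
maxFailed &lt;- 1
keepRelaxed &lt;- rowSums(detPToy &gt;= 0.01) &lt;= maxFailed
table(keepStrict)
table(keepRelaxed)</styled-content>
                    </preformat>
                </p>
                <p>In this toy example the strict rule retains three of the five probes, whereas the relaxed rule retains four, because the second probe fails in only one sample.</p>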
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">mSetSqFlt &lt;- mSetSq[keep,]
mSetSqFlt</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## class: GenomicRatioSet
## dim: 484535 10
## metadata(0):
## assays(2): M CN
## rownames(484535): cg13869341 cg14008030 ... cg08265308 cg14273923
## rowData names(0):
## colnames(10): naive.1 rTreg.2 ... act_naive.9 act_rTreg.10
## colData names(11): Sample_Name Sample_Well ... filenames
##   predictedSex
## Annotation
##   array: IlluminaHumanMethylation450k
##   annotation: ilmn12.hg19
## Preprocessing
##   Method: Raw (no normalization or bg correction)
##   minfi version: 1.20.2
##   Manifest version: 0.4.0</styled-content>
                    </preformat>
                </p>
                <p>Depending on the nature of your samples and your biological question, you may also choose to filter out the probes from the X and Y chromosomes or probes that are known to have common SNPs at the CpG site. As the samples in this dataset were all derived from male donors, we will not be removing the sex chromosome probes as part of this analysis; however, example code is provided below. A different dataset, which contains both male and female samples, is used to demonstrate a 
                    <xref ref-type="other" rid="DV">Differential Variability</xref> analysis and provides an example of when sex chromosome removal is necessary (
                    <xref ref-type="fig" rid="f13">Figure 13</xref>).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># if your data includes males and females, remove probes on the sex chromosomes</styled-content>

                        <styled-content style="font-size:15px;">keep &lt;- !(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">featureNames</styled-content>
                        <styled-content style="font-size:15px;">(mSetSqFlt) %in% ann450k$Name[ann450k$chr %in%</styled-content>
                                                               
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"chrX"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"chrY"</styled-content>
                        <styled-content style="font-size:15px;">)])</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                        <styled-content style="font-size:15px;">(keep)
mSetSqFlt &lt;- mSetSqFlt[keep,]</styled-content>
                    </preformat>
                </p>
                <p>The 
                    <italic toggle="yes">minfi</italic> package provides a function, dropLociWithSnps, that offers a simple interface for the removal of probes where common SNPs may affect the CpG. You can either remove all probes affected by SNPs (default), or only those with minor allele frequencies greater than a specified value.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># remove probes with SNPs at CpG site</styled-content>

                        <styled-content style="font-size:15px;">mSetSqFlt &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dropLociWithSnps</styled-content>
                        <styled-content style="font-size:15px;">(mSetSqFlt)
mSetSqFlt</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## class: GenomicRatioSet
## dim: 467351 10
## metadata(0):
## assays(2): M CN
## rownames(467351): cg13869341 cg14008030 ... cg08265308 cg14273923
## rowData names(0):
## colnames(10): naive.1 rTreg.2 ... act_naive.9 act_rTreg.10
## colData names(11): Sample_Name Sample_Well ... filenames
##   predictedSex
## Annotation
##   array: IlluminaHumanMethylation450k
##   annotation: ilmn12.hg19
## Preprocessing
##   Method: Raw (no normalization or bg correction)
##   minfi version: 1.20.2
##   Manifest version: 0.4.0</styled-content>
                    </preformat>
                </p>
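                <p>The effect of a minor allele frequency threshold can be illustrated with a small base R sketch. The annotation table below, its column names (CpG_maf, SBE_maf), the 0.05 cutoff and the helper hasCommonSnp are all hypothetical, and the comparison is assumed here to be "at or above the cutoff". The code imitates the idea of the filter rather than reproducing the internals of dropLociWithSnps.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># hypothetical SNP annotation for five probes; NA means no SNP is recorded
snpToy &lt;- data.frame(Name = paste0("cg", 1:5),
                     CpG_maf = c(NA, 0.20, 0.01, NA, 0.07),
                     SBE_maf = c(NA, NA, 0.30, 0.02, NA),
                     stringsAsFactors = FALSE)
mafCutoff &lt;- 0.05
# TRUE where a SNP is recorded and its minor allele frequency reaches the cutoff
hasCommonSnp &lt;- function(maf) !is.na(maf) &amp; maf &gt;= mafCutoff
# a probe is flagged if either the CpG site or the single base extension site is affected
dropProbe &lt;- hasCommonSnp(snpToy$CpG_maf) | hasCommonSnp(snpToy$SBE_maf)
keptProbes &lt;- snpToy$Name[!dropProbe]
keptProbes
# with minfi, a comparable threshold could be requested as:
# dropLociWithSnps(mSetSqFlt, snps = c("CpG", "SBE"), maf = 0.05)</styled-content>
                    </preformat>
                </p>
                <p>With the default setting, any probe with a recorded SNP is removed, which is the more conservative choice used in this workflow; a non-zero threshold retains probes whose SNPs are rare.</p>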
                <p>We will also filter out probes that have been shown to be cross-reactive, that is, probes that have been demonstrated to map to multiple places in the genome. This list was originally published by 
                    <xref ref-type="bibr" rid="ref-7">Chen 
                        <italic toggle="yes">et al.</italic> (2013)</xref> and can be obtained from the authors&#x2019; 
                    <ext-link ext-link-type="uri" xlink:href="http://www.sickkids.ca/MS-Office-Files/Research/Weksberg Lab/48639-non-specific-probes-Illumina450k.xlsx">website</ext-link>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># exclude cross reactive probes</styled-content>

                        <styled-content style="font-size:15px;">xReactiveProbes &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">read.csv</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">file=paste</styled-content>
                        <styled-content style="font-size:15px;">(dataDirectory,</styled-content>
                               
                        <styled-content style="font-size:15px;color:#4F9905;">"48639-non-specific-probes-Illumina450k.csv"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                               
                        <styled-content style="font-size:15px;color:#214A87;">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"/"</styled-content>
                        <styled-content style="font-size:15px;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">stringsAsFactors=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;">)
keep &lt;- !(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">featureNames</styled-content>
                        <styled-content style="font-size:15px;">(mSetSqFlt) %in% xReactiveProbes$TargetID)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">table</styled-content>
                        <styled-content style="font-size:15px;">(keep)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## keep
##  FALSE   TRUE
##  27433 439918</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">mSetSqFlt &lt;- mSetSqFlt[keep,]
mSetSqFlt</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## class: GenomicRatioSet
## dim: 439918 10
## metadata(0):
## assays(2): M CN
## rownames(439918): cg13869341 cg24669183 ... cg08265308 cg14273923
## rowData names(0):
## colnames(10): naive.1 rTreg.2 ... act_naive.9 act_rTreg.10
## colData names(11): Sample_Name Sample_Well ... filenames
##   predictedSex
## Annotation
##   array: IlluminaHumanMethylation450k
##   annotation: ilmn12.hg19
## Preprocessing
##   Method: Raw (no normalization or bg correction)
##   minfi version: 1.20.2
##   Manifest version: 0.4.0</styled-content>
                    </preformat>
                </p>
                <p>Once the data has been filtered and normalised, it is often useful to re-examine the MDS plots to see if the relationship between the samples has changed. It is apparent from the new MDS plots that much of the inter-individual variation has been removed as this is no longer the first principal component (
                    <xref ref-type="fig" rid="f6">Figure 6</xref>), likely due to the removal of the SNP-affected CpG probes. However, the samples do still cluster by individual in the second dimension (
                    <xref ref-type="fig" rid="f6">Figure 6</xref> and 
                    <xref ref-type="fig" rid="f7">Figure 7</xref>) and thus a factor for individual should still be included in the model.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSqFlt),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.8</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"right"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">text.col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.65</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>


                        <styled-content style="font-size:15px;color:#214A87;">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSqFlt),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)])</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"right"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">text.col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.7</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>
        
                        <styled-content style="font-size:15px;color:#8F5903;"># Examine higher dimensions to look at other sources of variation</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSqFlt),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dim=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"right"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">text.col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.7</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>


                        <styled-content style="font-size:15px;color:#214A87;">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSqFlt),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dim=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"topright"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">text.col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.7</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>


                        <styled-content style="font-size:15px;color:#214A87;">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSqFlt),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dim=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">4</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"right"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">text.col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.7</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">bg=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f6" orientation="portrait" position="float">
                    <label>Figure 6. </label>
                    <caption>
                        <title>Removing SNP-affected CpG probes from the data changes the sample clustering in the MDS plots.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure6.gif"/>
                </fig>
                <fig fig-type="figure" id="f7" orientation="portrait" position="float">
                    <label>Figure 7. </label>
                    <caption>
                        <title>Examining the higher dimensions of the MDS plots shows that significant inter-individual variation still exists in the second and third principal components.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure7.gif"/>
                </fig>
                <p>The next step is to calculate M-values and beta values (
                    <xref ref-type="fig" rid="f8">Figure 8</xref>). As previously mentioned, M-values have better statistical properties and are therefore preferred for statistical analysis of methylation data, whilst beta values are easier to interpret and are therefore preferred for displaying data. A detailed comparison of M-values and beta values was published by 
                    <xref ref-type="bibr" rid="ref-10">Du 
                        <italic toggle="yes">et al.</italic> (2010)</xref>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># calculate M-values for statistical analysis</styled-content>

                        <styled-content style="font-size:15px;">mVals &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">getM</styled-content>
                        <styled-content style="font-size:15px;">(mSetSqFlt)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;">(mVals[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">5</styled-content>
                        <styled-content style="font-size:15px;">])</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">##              naive.1   rTreg.2 act_naive.3   naive.4  act_naive.5
## cg13869341  2.421276  2.515948    2.165745  2.286314     2.109441
## cg24669183  2.169414  2.235964    2.280734  1.632309     2.184435
## cg15560884  1.761176  1.577578    1.597503  1.777486     1.764999
## cg01014490 -3.504268 -3.825119   -5.384735 -4.537864    -4.296526
## cg17505339  3.082191  3.924931    4.163206  3.255373     3.654134
## cg11954957  1.546401  1.912204    1.727910  2.441267     1.618331</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">bVals &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">getBeta</styled-content>
                        <styled-content style="font-size:15px;">(mSetSqFlt)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;">(bVals[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">5</styled-content>
                        <styled-content style="font-size:15px;">])</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">##               naive.1    rTreg.2 act_naive.3    naive.4 act_naive.5
## cg13869341 0.84267937 0.85118462   0.8177504 0.82987650  0.81186174
## cg24669183 0.81812908 0.82489238   0.8293297 0.75610281  0.81967323
## cg15560884 0.77219626 0.74903910   0.7516263 0.77417882  0.77266205
## cg01014490 0.08098986 0.06590459   0.0233755 0.04127262  0.04842397
## cg17505339 0.89439216 0.93822870   0.9471357 0.90520570  0.92641305
## cg11954957 0.74495496 0.79008516   0.7681146 0.84450764  0.75431167</styled-content>
                    </preformat>
                </p>
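                <p>The two scales are related by a logit transformation: in the absence of an offset, an M-value is <monospace>log2(beta / (1 - beta))</monospace>. As a minimal sketch in base R, the snippet below applies this formula to two of the beta values printed above and should recover the corresponding M-values to about three decimal places. The helper names <monospace>betaToM</monospace> and <monospace>mToBeta</monospace> and the object names are hypothetical and are not part of <italic toggle="yes">minfi</italic>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# hypothetical helpers illustrating the beta / M-value relationship (no offset)
betaToM &lt;- function(beta) log2(beta / (1 - beta))
mToBeta &lt;- function(m) 2^m / (1 + 2^m)

# beta values for sample naive.1, copied from the output above
bCheck &lt;- c(cg13869341 = 0.84267937, cg01014490 = 0.08098986)

# convert to M-values and compare with the getM() output above
mCheck &lt;- betaToM(bCheck)
round(mCheck, 3)

# converting back should return the original beta values
all.equal(mToBeta(mCheck), bCheck)
                    </preformat>
                </p>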
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">densityPlot</styled-content>
                        <styled-content style="font-size:15px;">(bVals,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sampGroups=</styled-content>
                        <styled-content style="font-size:15px;">targets$Sample_Group,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">main=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Beta values"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
             
                        <styled-content style="font-size:15px;color:#214A87;">legend=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">xlab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Beta values"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"top"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend = levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">text.col=brewer.pal</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">8</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Dark2"</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">densityPlot</styled-content>
                        <styled-content style="font-size:15px;">(mVals,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sampGroups=</styled-content>
                        <styled-content style="font-size:15px;">targets$Sample_Group,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">main=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"M-values"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
             
                        <styled-content style="font-size:15px;color:#214A87;">legend=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">xlab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"M values"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"topleft"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend = levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)),</styled-content>
        
                        <styled-content style="font-size:15px;color:#214A87;">text.col=brewer.pal</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">8</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Dark2"</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f8" orientation="portrait" position="float">
                    <label>Figure 8. </label>
                    <caption>
                        <title>The distributions of beta and M-values are quite different; beta values are constrained between 0 and 1 whilst M-values range between -Inf and Inf.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure8.gif"/>
                </fig>
            </sec>
            <sec>
                <title>Probe-wise differential methylation analysis</title>
                <p>The biological question of interest for this particular dataset is to discover differentially methylated probes between the different cell types. However, as was apparent in the MDS plots, there is another factor that we need to take into account when we perform the statistical analysis. In the 
                    <monospace>targets</monospace> file, there is a column called 
                    <monospace>Sample_Source</monospace>, which refers to the individuals that the samples were collected from. In this dataset, each of the individuals contributes more than one cell type. For example, individual M28 contributes 
                    <monospace>naive</monospace>, 
                    <monospace>rTreg</monospace> and 
                    <monospace>act_naive</monospace> samples. Hence, when we specify our design matrix, we need to include two factors: individual and cell type. This style of analysis is called a paired analysis; differences between cell types are calculated 
                    <italic toggle="yes">within</italic> each individual, and then these differences are averaged 
                    <italic toggle="yes">across</italic> individuals to determine whether there is an overall significant difference in the mean methylation level for each CpG site. The 
                    <italic toggle="yes">limma</italic> 
                    <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf">User&#x2019;s Guide</ext-link> extensively covers the different types of designs that are commonly used for microarray experiments and how to analyse them in R.</p>
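                <p>To make the idea of a paired analysis concrete, the following base R sketch uses made-up M-values for a single CpG site in three hypothetical individuals (A, B and C), each contributing one <monospace>naive</monospace> and one <monospace>rTreg</monospace> sample. With complete, balanced data such as this, the cell type difference estimated from the additive model equals the average of the within-individual differences. The data and all object names are assumptions for illustration only; the workflow itself fits the model with <italic toggle="yes">limma</italic>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# made-up M-values for one CpG site: three individuals, two cell types each
toyM &lt;- c(2.1, 2.9, 1.5, 2.6, 3.0, 3.7)
toyCellType &lt;- factor(rep(c("naive", "rTreg"), times = 3))
toyIndividual &lt;- factor(rep(c("A", "B", "C"), each = 2))

# additive model: one column per cell type plus individual effects
toyDesign &lt;- model.matrix(~0 + toyCellType + toyIndividual)
toyFit &lt;- lm.fit(toyDesign, toyM)

# rTreg vs naive difference estimated from the model coefficients
modelDiff &lt;- unname(toyFit$coefficients["toyCellTyperTreg"] -
                    toyFit$coefficients["toyCellTypenaive"])

# average of the within-individual differences
pairedDiff &lt;- mean(toyM[toyCellType == "rTreg"] - toyM[toyCellType == "naive"])

# the two estimates agree for this balanced toy design
all.equal(modelDiff, pairedDiff)
                    </preformat>
                </p>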
                <p>We are interested in pairwise comparisons between the four cell types, taking into account individual-to-individual variation. We perform this analysis on the matrix of M-values in 
                    <italic toggle="yes">limma</italic>, obtaining moderated t-statistics and associated p-values for each CpG site. When there are many comparisons of interest to test, a convenient way to set up the model is to use a contrasts matrix in conjunction with the design matrix. Each column of the contrasts matrix defines a linear combination of the coefficients (one per column of the design matrix) that corresponds to one comparison of interest.</p>
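                <p>As a rough illustration of what a contrasts matrix does, the base R sketch below writes one out by hand for three cell types and applies it to a vector of made-up coefficients. Each column of the contrasts matrix picks out one pairwise comparison as a linear combination of the coefficients. The coefficient values and the object names <monospace>toyCoefs</monospace> and <monospace>toyContrasts</monospace> are hypothetical; in practice the <monospace>makeContrasts</monospace> function in <italic toggle="yes">limma</italic> can build such a matrix from the design matrix column names.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# made-up coefficients, named like the columns of a design matrix
toyCoefs &lt;- c(cellTypenaive = 1.2, cellTyperTreg = 2.0,
              cellTypeact_naive = 1.5, individualB = 0.3)

# one column per comparison, one row per coefficient;
# the individual effect gets weight 0 in every comparison
toyContrasts &lt;- cbind(
  "naive-rTreg"     = c(1, -1,  0, 0),
  "naive-act_naive" = c(1,  0, -1, 0),
  "rTreg-act_naive" = c(0,  1, -1, 0)
)
rownames(toyContrasts) &lt;- names(toyCoefs)

# each estimate is a linear combination of the coefficients
toyEstimates &lt;- drop(crossprod(toyContrasts, toyCoefs))
toyEstimates
                    </preformat>
                </p>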
                <p>Since we are performing hundreds of thousands of hypothesis tests, we need to adjust the p-values for multiple testing. When a very large number of tests is being performed, a common procedure for assessing the statistical significance of a change in mean levels between two groups is to apply a cut-off to the false discovery rate (
                    <xref ref-type="bibr" rid="ref-3">Benjamini &amp; Hochberg, 1995</xref>), rather than to the unadjusted p-value. Typically a 5% FDR is used, which means that the researcher is willing to accept that, of the CpG sites called significantly differentially methylated, 5% are expected to be false discoveries. If the p-values are not adjusted for multiple testing, the number of false discoveries will be unacceptably high. For this dataset, assuming a Type I error rate of 5%, we would expect to see 0.05*439918, or approximately 21996, statistically significant results for a given comparison, even if there were truly no differentially methylated CpG sites.</p>
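                <p>The arithmetic above can be checked with a small simulation. The sketch below assumes, purely for illustration, that every test is a true null with independent, uniformly distributed p-values, which real CpG-level tests will not satisfy exactly. It counts how many simulated p-values fall below 0.05 before and after Benjamini-Hochberg adjustment using <monospace>p.adjust</monospace> from base R. The object names are hypothetical.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
set.seed(1)
nTests &lt;- 439918   # number of probes tested, as quoted in the text
alpha &lt;- 0.05

# expected number of false positives at an unadjusted cut-off of 0.05
expectedFalse &lt;- nTests * alpha

# simulated p-values when no CpG site is truly differentially methylated
nullP &lt;- runif(nTests)

# calls at the unadjusted cut-off versus a 5% FDR cut-off
nRaw &lt;- sum(nullP &lt; alpha)
nBH &lt;- sum(p.adjust(nullP, method = "BH") &lt; alpha)

c(expected = expectedFalse, unadjusted = nRaw, BH = nBH)
                    </preformat>
                </p>
                <p>Under these assumptions the unadjusted count should land close to the expected value, whereas the FDR-adjusted count should be at or near zero.</p>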
                <p>Based on a false discovery rate of 5%, there are 3021 significantly differentially methylated CpGs in the 
                    <monospace>naive</monospace> vs 
                    <monospace>rTreg</monospace> comparison, while 
                    <monospace>rTreg</monospace> vs 
                    <monospace>act_rTreg</monospace> does not show any significant differential methylation.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># this is the factor of interest</styled-content>

                        <styled-content style="font-size:15px;">cellType &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group)</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># this is the individual effect that we need to account for</styled-content>

                        <styled-content style="font-size:15px;">individual &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Source)</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903;"># use the above to create a design matrix</styled-content>

                        <styled-content style="font-size:15px;">design &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">model.matrix</styled-content>
                        <styled-content style="font-size:15px;">(~</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0</styled-content>
                        <styled-content style="font-size:15px;">+cellType+individual,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">data=</styled-content>
                        <styled-content style="font-size:15px;">targets)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">colnames</styled-content>
                        <styled-content style="font-size:15px;">(design) &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">levels</styled-content>
                        <styled-content style="font-size:15px;">(cellType),</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">levels</styled-content>
                        <styled-content style="font-size:15px;">(individual)[-</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">])</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903;"># fit the linear model</styled-content>

                        <styled-content style="font-size:15px;">fit &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">lmFit</styled-content>
                        <styled-content style="font-size:15px;">(mVals, design)</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># create a contrast matrix for specific comparisons</styled-content>

                        <styled-content style="font-size:15px;">contMatrix &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">makeContrasts</styled-content>
                        <styled-content style="font-size:15px;">(naive-rTreg,
			   naive-act_naive,
			   rTreg-act_rTreg,
			   act_naive-act_rTreg,</styled-content>
			      
                        <styled-content style="font-size:15px;color:#214A87;">levels=</styled-content>
                        <styled-content style="font-size:15px;">design)
contMatrix</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## 	     Contrasts
## Levels     naive - rTreg naive - act_naive rTreg - act_rTreg
##   act_naive 		  0 		   -1 		      0
##   act_rTreg 		  0 		    0 		     -1
##   naive 		  1 		    1 		      0
##   rTreg 		 -1 		    0 		      1
##   M29 		  0 		    0 		      0
##   M30 		  0 		    0 		      0
## 	     Contrasts
## Levels     act_naive - act_rTreg
##   act_naive 		  	  1
##   act_rTreg 		  	 -1
##   naive 		  	  0
##   rTreg 		  	  0
##   M29 		  	  0
##   M30 		  	  0</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># fit the contrasts</styled-content>

                        <styled-content style="font-size:15px;">fit2 &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">contrasts.fit</styled-content>
                        <styled-content style="font-size:15px;">(fit, contMatrix)
fit2 &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">eBayes</styled-content>
                        <styled-content style="font-size:15px;">(fit2)</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903;"># look at the numbers of DM CpGs at FDR &lt; 0.05</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">summary</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">decideTests</styled-content>
                        <styled-content style="font-size:15px;">(fit2))</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">##     naive - rTreg naive - act_naive rTreg - act_rTreg act_naive - act_rTreg
## -1 		1618 		   400 		       0 		   559
## 0 	      436897 		439291 		  439918 	        438440
## 1 		1403 		   227 		       0 		   919</styled-content>
                    </preformat>
                </p>
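                <p>The total number of significant CpGs for a comparison is the sum of the -1 and 1 rows of the summary above; for the first column this gives the 3021 CpGs quoted earlier. The sketch below repeats that arithmetic and shows the same tally on a matrix of calls coded -1, 0 and 1. The matrix 
                    <monospace>calls</monospace> and its column names are made up for illustration and only stand in for the kind of result that 
                    <monospace>decideTests</monospace> returns.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># counts from the first column of the summary (naive - rTreg)
down &lt;- 1618
up &lt;- 1403
down + up

# the same tally from a matrix of calls coded -1, 0 and 1;
# 'calls' is a small made-up stand-in, not the real result
calls &lt;- cbind("c1" = c(-1, 0, 1, 1, 0),
               "c2" = c(0, 0, 0, -1, 0))
colSums(calls != 0)</styled-content>
                    </preformat>
                </p>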
                <p>We can extract the tables of differentially methylated CpGs for each comparison, ordered by B-statistic by default, using the 
                    <monospace>topTable</monospace> function in 
                    <italic toggle="yes">limma</italic>. The B-statistic is the log-odds of differential methylation (
                    <xref ref-type="bibr" rid="ref-22">Lonnstedt &amp; Speed, 2002</xref>). To order by p-value instead, the user can specify 
                    <monospace>sort.by="p"</monospace>; in most cases, the ordering based on the p-value and the ordering based on the B-statistic will be identical. The results of the analysis for the first comparison, 
                    <monospace>naive</monospace> vs. 
                    <monospace>rTreg</monospace>, can be saved as a 
                    <monospace>data.frame</monospace> by setting 
                    <monospace>coef=1</monospace>. The 
                    <monospace>coef</monospace> parameter explicitly refers to the column in the contrasts matrix which corresponds to the comparison of interest.</p>
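                <p>As a small, self-contained illustration of the 
                    <monospace>coef</monospace> argument, the sketch below fits contrasts to simulated data and extracts the same comparison once by column index and once by column name. It assumes that 
                    <italic toggle="yes">limma</italic> is installed; the groups a, b and c, the probe names and the random values are all hypothetical and unrelated to the workflow data.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">library(limma)

# simulated M-values: 100 hypothetical probes, 3 groups of 3 samples
set.seed(42)
grp &lt;- factor(rep(c("a", "b", "c"), each = 3))
y &lt;- matrix(rnorm(100 * 9), nrow = 100,
            dimnames = list(paste0("cpg", 1:100), NULL))

# design and contrasts set up in the same way as in the workflow
toyDesign &lt;- model.matrix(~ 0 + grp)
colnames(toyDesign) &lt;- levels(grp)
toyCont &lt;- makeContrasts(a - b, a - c, levels = toyDesign)
toyFit &lt;- eBayes(contrasts.fit(lmFit(y, toyDesign), toyCont))

# the first contrast, selected by position and by name
byIndex &lt;- topTable(toyFit, coef = 1, number = Inf, sort.by = "p")
byName &lt;- topTable(toyFit, coef = "a - b", number = Inf, sort.by = "p")
all.equal(byIndex, byName)</styled-content>
                    </preformat>
                </p>
                <p>Referring to a contrast by name can be less error-prone than using its position when the contrasts matrix is later reordered or extended.</p>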
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># get the table of results for the first contrast (naive - rTreg)</styled-content>

                        <styled-content style="font-size:15px;">ann450kSub &lt;- ann450k[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">match</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;">(mVals),ann450k$Name),</styled-content>
                         
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">4</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">12</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">19</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">24</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">ncol</styled-content>
                        <styled-content style="font-size:15px;">(ann450k))]
DMPs &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">topTable</styled-content>
                        <styled-content style="font-size:15px;">(fit2,</styled-content>  
                        <styled-content style="font-size:15px;color:#214A87;">num=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">Inf</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">coef=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">genelist=</styled-content>
                        <styled-content style="font-size:15px;">ann450kSub)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;">(DMPs)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## 		chr 	  pos strand 	   Name Probe_rs Probe_maf CpG_rs
## cg07499259  chr1  12188502 	   + cg07499259     &lt;NA&gt; 	NA   &lt;NA&gt;
## cg26992245  chr8  29848579 	   - cg26992245     &lt;NA&gt; 	NA   &lt;NA&gt;
## cg09747445 chr15  70387268 	   - cg09747445     &lt;NA&gt; 	NA   &lt;NA&gt;
## cg18808929  chr8  61825469 	   - cg18808929     &lt;NA&gt; 	NA   &lt;NA&gt;
## cg25015733  chr2  99342986 	   - cg25015733     &lt;NA&gt; 	NA   &lt;NA&gt;
## cg21179654  chr3 114057297 	   + cg21179654     &lt;NA&gt; 	NA   &lt;NA&gt;
## 	      CpG_maf SBE_rs SBE_maf 		Islands_Name
## cg07499259 	   NA   &lt;NA&gt; 	  NA
## cg26992245 	   NA   &lt;NA&gt; 	  NA
## cg09747445 	   NA   &lt;NA&gt; 	  NA chr15:70387929-70393206
## cg18808929 	   NA   &lt;NA&gt; 	  NA  chr8:61822358-61823028
## cg25015733 	   NA   &lt;NA&gt; 	  NA  chr2:99346882-99348177
## cg21179654 	   NA   &lt;NA&gt; 	  NA
## 	      Relation_to_Island
## cg07499259 		 OpenSea
## cg26992245 		 OpenSea
## cg09747445 		 N_Shore
## cg18808929 		 S_Shelf
## cg25015733 		 N_Shelf
## cg21179654 		 OpenSea
## 					     UCSC_RefGene_Name
## cg07499259 				       TNFRSF8;TNFRSF8
## cg26992245
## cg09747445 				        TLE3;TLE3;TLE3
## cg18808929
## cg25015733 				          	MGAT4A
## cg21179654 ZBTB20;ZBTB20;ZBTB20;ZBTB20;ZBTB20;ZBTB20;ZBTB20
## 					UCSC_RefGene_Accession
## cg07499259 					        NM_152942;NM_001243
## cg26992245
## cg09747445 							    
## cg18808929
## cg25015733 										    
## cg21179654 NM_001164343;NM_001164346;NM_001164345;NM_001164342;
## 				     UCSC_RefGene_Group Phantom DMR Enhancer
## cg07499259 				     5'UTR;Body
## cg26992245 								TRUE
## cg09747445			         Body;Body;Body
## cg18808929 								TRUE
## cg25015733 				          5'UTR
## cg21179654 3'UTR;3'UTR;3'UTR;3'UTR;3'UTR;3'UTR;3'UTR
## 		       HMM_Island Regulatory_Feature_Name
## cg07499259 1:12111023-12111225
## cg26992245
## cg09747445
## cg18808929
## cg25015733
## cg21179654 			    3:114057192-114057775
## 		     Regulatory_Feature_Group DHS     logFC     AveExpr
## cg07499259 					   3.654104  2.46652171
## cg26992245 					   4.450696 -0.09180715
## cg09747445 					  -3.337299 -0.25201484
## cg18808929 					  -2.990263  0.77522878
## cg25015733 					  -3.054336  0.83280190
## cg21179654 Unclassified_Cell_type_specific 	   2.859016  1.32460816
## 		      t      P.Value   adj.P.Val	B
## cg07499259  18.73131 7.267204e-08 0.005067836 7.453206
## cg26992245  18.32674 8.615461e-08 0.005067836 7.359096
## cg09747445 -18.24438 8.923101e-08 0.005067836 7.339443
## cg18808929 -17.90181 1.034217e-07 0.005067836 7.255825
## cg25015733 -17.32615 1.333546e-07 0.005067836 7.108231
## cg21179654  17.27804 1.362674e-07 0.005067836 7.095476</styled-content>
                    </preformat>
                </p>
                <p>The resulting 
                    <monospace>data.frame</monospace> can easily be written to a CSV file, which can be opened in Excel.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">write.table</styled-content>
                        <styled-content style="font-size:15px;">(DMPs,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">file=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DMPs.csv"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">","</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">row.names=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <p>It is always useful to plot sample-wise methylation levels for the top differentially methylated CpG sites to quickly ensure the results make sense (
                    <xref ref-type="fig" rid="f9">Figure 9</xref>). If the plots do not look as expected, it is usually an indication of an error in the code, or in setting up the design matrix. It is easier to interpret methylation levels on the beta value scale, so although the analysis is performed on the M-value scale, we visualise data on the beta value scale. The 
                    <monospace>plotCpg</monospace> function in 
                    <italic toggle="yes">minfi</italic> is a convenient way to plot the sample-wise beta values stratified by the grouping variable.</p>
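                <p>The relationship between the two scales can be sketched with two small helper functions. The names 
                    <monospace>betaToM</monospace> and 
                    <monospace>mToBeta</monospace> are hypothetical and are not part of 
                    <italic toggle="yes">minfi</italic>; the sketch assumes the usual logit definition of M-values, M = log2(beta / (1 - beta)).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># convert beta values (proportions strictly between 0 and 1) to M-values
betaToM &lt;- function(beta) log2(beta / (1 - beta))

# convert M-values back to beta values
mToBeta &lt;- function(m) 2^m / (2^m + 1)

# made-up beta values for illustration
beta &lt;- c(0.1, 0.5, 0.9)
m &lt;- betaToM(beta)
m
mToBeta(m)</styled-content>
                    </preformat>
                </p>
                <p>A beta value of 0.5 corresponds to an M-value of 0, and the M-value scale is symmetric about that point, so beta values of 0.1 and 0.9 map to M-values of equal size and opposite sign.</p>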
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># plot the top 4 most significantly differentially methylated CpGs</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">sapply</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;">(DMPs)[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">4</styled-content>
                        <styled-content style="font-size:15px;">], </styled-content>
                        <styled-content style="font-size:15px;">function(cpg){</styled-content>
  
                        <styled-content style="font-size:15px;color:#214A87;">plotCpg</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;">bVals,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">cpg=</styled-content>
                        <styled-content style="font-size:15px;">cpg,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">pheno=</styled-content>
                        <styled-content style="font-size:15px;">targets$Sample_Group,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">ylab =</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"Beta values"</styled-content>
                        <styled-content style="font-size:15px;">)
})</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## $cg07499259
## NULL
##
## $cg26992245
## NULL
##
## $cg09747445
## NULL
##
## $cg18808929
## NULL</styled-content>
                    </preformat>
                </p>
            </sec>
            <sec>
                <title>Differential methylation analysis of regions</title>
                <p>Although performing a 
                    <italic toggle="yes">probe-wise</italic> analysis is useful and informative, sometimes we are interested in knowing whether several proximal CpGs are concordantly differentially methylated, that is, we want to identify differentially methylated 
                    <italic toggle="yes">regions</italic>. There are several Bioconductor packages that have functions for identifying differentially methylated regions from 450k data. Some of the most popular are the 
                    <monospace>dmrFind</monospace> function in the 
                    <ext-link ext-link-type="uri" xlink:href="http://www.bioconductor.org/packages/release/bioc/html/charm.html">charm</ext-link> package, which has been somewhat superseded for 450k arrays by the 
                    <monospace>bumphunter</monospace> function in 
                    <ext-link ext-link-type="uri" xlink:href="http://bioconductor.org/packages/release/bioc/html/minfi.html">minfi</ext-link> (
                    <xref ref-type="bibr" rid="ref-1">Aryee 
                        <italic toggle="yes">et al.</italic>, 2014</xref>; 
                    <xref ref-type="bibr" rid="ref-19">Jaffe 
                        <italic toggle="yes">et al.</italic>, 2012</xref>), and the recently published 
                    <monospace>dmrcate</monospace> in the 
                    <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/packages/release/bioc/html/DMRcate.html">DMRcate</ext-link> package (
                    <xref ref-type="bibr" rid="ref-27">Peters 
                        <italic toggle="yes">et al.</italic>, 2015</xref>). They are each based on different statistical methods. In our experience, the 
                    <monospace>bumphunter</monospace> and 
                    <monospace>dmrFind</monospace> functions can be somewhat slow to run unless you have the computing infrastructure to parallelise them, as they use permutations to assign significance. In this workflow, we will perform an analysis using the 
                    <monospace>dmrcate</monospace> function. As it is based on 
                    <italic toggle="yes">limma</italic>, we can directly use the 
                    <monospace>design</monospace> and 
                    <monospace>contMatrix</monospace> we previously defined.</p>
                <p>First, our matrix of M-values is annotated with relevant information about the probes, such as their genomic position and gene annotation. By default, this is done using the 
                    <monospace>ilmn12.hg19</monospace> annotation, but this can be replaced with any annotation compatible with the interface provided by the 
                    <italic toggle="yes">minfi</italic> package. The 
                    <italic toggle="yes">limma</italic> pipeline is then used for differential methylation analysis to calculate moderated t-statistics.</p>
                <fig fig-type="figure" id="f9" orientation="portrait" position="float">
                    <label>Figure 9. </label>
                    <caption>
                        <title>Plotting the top few differentially methylated CpGs is a good way to check whether the results make sense.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure9.gif"/>
                </fig>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">myAnnotation &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">cpg.annotate</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">object = </styled-content>
                        <styled-content style="font-size:15px;">mVals,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">datatype =</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"array"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">what = </styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"M"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                                
                        <styled-content style="font-size:15px;color:#214A87;">analysis.type=</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"differential"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">design = </styled-content>
                        <styled-content style="font-size:15px;">design,</styled-content>
                                
                        <styled-content style="font-size:15px;color:#214A87;">contrasts =</styled-content> 
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">cont.matrix =</styled-content> 
                        <styled-content style="font-size:15px;">contMatrix,</styled-content>
                                
                        <styled-content style="font-size:15px;color:#214A87;">coef = </styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"naive - rTreg"</styled-content>
                        <styled-content style="font-size:15px;">, </styled-content>
                        <styled-content style="font-size:15px;">arraytype = </styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"450K"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
 
                        <styled-content style="font-size:15px;">## Your contrast returned 3021 individually significant probes.</styled-content>
 
                        <styled-content style="font-size:15px;">## We recommend the default setting of pcutoff in dmrcate().</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">str</styled-content>
                        <styled-content style="font-size:15px;">(myAnnotation)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## List of 7
##  $ ID    : Factor w/ 439918 levels "cg00000029","cg00000108",..: 232388 391918 260351 ...
##  $ stat  : num [1:439918] 0.0489 -2.0773 0.7711 -0.0304 -0.764 ...
##  $ CHR   : Factor w/ 24 levels "chr1","chr10",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ pos   : int [1:439918] 15865 534242 710097 714177 720865 758829 763119 779995 ...
##  $ betafc: num [1:439918] 0.00039 -0.04534 0.01594 0.00251 -0.00869 ...
##  $ indfdr: num [1:439918] 0.994 0.565 0.872 0.997 0.873 ...
##  $ is.sig: logi [1:439918] FALSE FALSE FALSE FALSE FALSE FALSE ...
##  - attr(*, "row.names")= int [1:439918] 1 2 3 4 5 6 7 8 9 10 ...
##  - attr(*, "class")= chr "annot"</styled-content>
                    </preformat>
                </p>
                <p>Once we have the relevant statistics for the individual CpGs, we can then use the 
                    <monospace>dmrcate</monospace> function to combine them to identify differentially methylated regions. The main output table 
                    <monospace>DMRs$results</monospace> contains all of the regions found, along with their genomic annotations and p-values.</p>
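                <p>As a rough illustration of how statistics from neighbouring probes can be combined, the sketch below smooths squared test statistics along a chromosome with a Gaussian kernel. This is a deliberate simplification and not the actual 
                    <italic toggle="yes">DMRcate</italic> implementation, which goes on to derive p-values from the smoothed values. The positions and statistics are made up, and taking the kernel standard deviation as 
                    <monospace>lambda</monospace> divided by 
                    <monospace>C</monospace> is an assumption that should be checked against the package documentation.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># made-up genomic positions (bp) and moderated t-statistics for 8 probes
pos &lt;- c(1000, 1200, 1350, 1500, 9000, 9100, 20000, 20500)
tStat &lt;- c(4.1, 5.0, 3.8, 4.5, 0.2, -0.4, 0.1, 0.3)

# kernel settings mirroring lambda=1000 and C=2; sigma is assumed
# to be lambda divided by the scaling factor
lambda &lt;- 1000
scaleFactor &lt;- 2
sigma &lt;- lambda / scaleFactor

# Gaussian kernel weights between every pair of probes
w &lt;- exp(-outer(pos, pos, "-")^2 / (2 * sigma^2))

# kernel-weighted sum of squared statistics at each probe
smoothed &lt;- drop(w %*% tStat^2)
round(smoothed, 2)</styled-content>
                    </preformat>
                </p>
                <p>The first four probes lie close together and all have large statistics, so their smoothed values reinforce one another, whereas the distant probes with small statistics keep small smoothed values.</p>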
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">DMRs &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dmrcate</styled-content>
                        <styled-content style="font-size:15px;">(myAnnotation,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">lambda=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">C=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;">(DMRs$results)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">##                          coord no.cpgs        minfdr     Stouffer
## 452    chr17:57915665-57918682      12  4.957890e-91 6.639928e-10
## 723   chr3:114012316-114012912       5 1.622885e-180 1.515378e-07
## 464    chr17:74639731-74640078       6  9.516873e-90 1.527961e-07
## 1053    chrX:49121205-49122718       6  6.753751e-84 2.936984e-07
## 487    chr18:21452730-21453131       7 5.702319e-115 7.674943e-07
## 186  chr10:135202522-135203200       6  1.465070e-65 7.918224e-07
##       maxbetafc meanbetafc
## 452   0.3982862  0.3131611
## 723   0.5434277  0.4251622
## 464  -0.2528645 -0.1951904
## 1053  0.4529088  0.3006242
## 487  -0.3867474 -0.2546089
## 186   0.2803157  0.2293419</styled-content>
                    </preformat>
                </p>
                <p>As for the probe-wise analysis, it is advisable to visualise the results to ensure that they make sense. The regions can easily be viewed using the 
                    <monospace>DMR.plot</monospace> function provided in the 
                    <italic toggle="yes">DMRcate</italic> package (
                    <xref ref-type="fig" rid="f10">Figure 10</xref>).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># convert the regions to annotated genomic ranges</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">data</styled-content>
                        <styled-content style="font-size:15px;">(dmrcatedata)
results.ranges &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">extractRanges</styled-content>
                        <styled-content style="font-size:15px;">(DMRs,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">genome =</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"hg19"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># set up the grouping variables and colours</styled-content>

                        <styled-content style="font-size:15px;">groups &lt;- pal[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">length</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">unique</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group))]</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">names</styled-content>
                        <styled-content style="font-size:15px;">(groups) &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group))
cols &lt;- groups[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">as.character</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">factor</styled-content>
                        <styled-content style="font-size:15px;">(targets$Sample_Group))]
samps &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">nrow</styled-content>
                        <styled-content style="font-size:15px;">(targets)</styled-content>
                    </preformat>
                </p>
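                <p>The lookup above works by indexing a named vector with the group labels. As a minimal sketch on made-up data, assuming a hypothetical three-colour palette and toy group labels (neither is taken from the 
                    <monospace>pal</monospace> or 
                    <monospace>targets</monospace> objects used in this workflow), the same three steps give one colour per sample:</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># toy stand-ins for pal and targets$Sample_Group (hypothetical values)
toyPal &lt;- c("#1B9E77", "#D95F02", "#7570B3")
toyGroup &lt;- c("naive", "rTreg", "naive", "act_naive")

# one colour per distinct group, named by the sorted factor levels
toyGroups &lt;- toyPal[1:length(unique(toyGroup))]
names(toyGroups) &lt;- levels(factor(toyGroup))

# index the named vector by label to get one colour per sample
toyCols &lt;- toyGroups[as.character(factor(toyGroup))]
toyCols</styled-content>
                    </preformat>
                </p>
                <p>Because the names are assigned from the sorted factor levels, the colour each group receives depends on the alphabetical order of the labels rather than on the order in which the samples appear.</p>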
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># draw the plot for the top DMR</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">DMR.plot(ranges=</styled-content>
                        <styled-content style="font-size:15px;">results.ranges,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">dmr=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">CpGs=</styled-content>
                        <styled-content style="font-size:15px;">bVals,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">phen.col=</styled-content>
                        <styled-content style="font-size:15px;">cols,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;"> what = </styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Beta"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
          
                        <styled-content style="font-size:15px;color:#214A87;">arraytype = </styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"450K"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">pch=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">16</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">toscale=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">plotmedians=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
          
                        <styled-content style="font-size:15px;color:#214A87;">genome=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"hg19"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">samps=</styled-content>
                        <styled-content style="font-size:15px;">samps)</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f10" orientation="portrait" position="float">
                    <label>Figure 10. </label>
                    <caption>
                        <title>The DMRcate &#x201c;DMR.plot&#x201d; function allows you to quickly visualise DMRs in their genomic context.</title>
                        <p>By default, the plot shows the location of the DMR in the genome, the position of any genes that are nearby, the base pair positions of the CpG probes, the methylation levels of the individual samples as a heatmap and the mean methylation levels for the various sample groups in the experiment. This plot shows the top ranked DMR identified by the DMRcate analysis.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure10.gif"/>
                </fig>
            </sec>
            <sec>
                <title>Customising visualisations of methylation data</title>
                <p>The 
                    <italic toggle="yes">Gviz</italic> package offers powerful functionality for plotting methylation data in its genomic context. The package 
                    <ext-link ext-link-type="uri" xlink:href="https://bioconductor.org/packages/release/bioc/vignettes/Gviz/inst/doc/Gviz.pdf">vignette</ext-link> is very extensive and covers the various types of plots that can be produced using the 
                    <italic toggle="yes">Gviz</italic> framework. We will plot one of the differentially methylated regions from the 
                    <italic toggle="yes">DMRcate</italic> analysis to demonstrate the type of visualisations that can be created (
                    <xref ref-type="fig" rid="f11">Figure 11</xref>).</p>
                <fig fig-type="figure" id="f11" orientation="portrait" position="float">
                    <label>Figure 11. </label>
                    <caption>
                        <title>The Gviz package provides extensive functionality for customising plots of genomic regions.</title>
                        <p>This plot shows the top ranked DMR identified by the DMRcate analysis.</p>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure11.gif"/>
                </fig>
                <p>We will first set up the genomic region we would like to plot by extracting the genomic coordinates of the top differentially methylated region.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># indicate which genome is being used</styled-content>

                        <styled-content style="font-size:15px;">gen &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905;">"hg19"</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># the index of the DMR that we will plot</styled-content>

                        <styled-content style="font-size:15px;">dmrIndex &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># extract chromosome number and location from DMR results</styled-content>

                        <styled-content style="font-size:15px;">coords &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">strsplit2</styled-content>
                        <styled-content style="font-size:15px;">(DMRs$results$coord[dmrIndex],</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">":"</styled-content>
                        <styled-content style="font-size:15px;">)
chrom &lt;- coords[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">]
start &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">as.numeric</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">strsplit2</styled-content>
                        <styled-content style="font-size:15px;">(coords[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">],</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"-"</styled-content>
                        <styled-content style="font-size:15px;">)[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">])
end &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">as.numeric</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">strsplit2</styled-content>
                        <styled-content style="font-size:15px;">(coords[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">],</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"-"</styled-content>
                        <styled-content style="font-size:15px;">)[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">])</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># add 25% extra space to plot</styled-content>

                        <styled-content style="font-size:15px;">minbase &lt;- start - (</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.25</styled-content>
                        <styled-content style="font-size:15px;">*(end-start))
maxbase &lt;- end + (</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.25</styled-content>
                        <styled-content style="font-size:15px;">*(end-start))</styled-content>
                    </preformat>
                </p>
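                <p>The parsing and padding steps can be checked in isolation. The following is a sketch in base R only, applied to the first coordinate string shown in the DMR results table; the variable names are illustrative and are not part of the workflow.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># coordinate string in the "chr:start-end" form used in the DMR results
coordString &lt;- "chr17:57915665-57918682"

# split on ":" or "-" to get chromosome, start and end
parts &lt;- strsplit(coordString, "[:-]")[[1]]
chromToy &lt;- parts[1]
startToy &lt;- as.numeric(parts[2])
endToy &lt;- as.numeric(parts[3])

# pad the window by 25% of the region width on each side
pad &lt;- 0.25 * (endToy - startToy)
c(minbase = startToy - pad, maxbase = endToy + pad)</styled-content>
                    </preformat>
                </p>
                <p>The padded limits are generally not whole numbers; if integer coordinates are needed, they could be rounded outwards with 
                    <monospace>floor</monospace> and 
                    <monospace>ceiling</monospace>.</p>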
                <p>Next, we will add some genomic annotations of interest, such as the locations of CpG islands and DNaseI hypersensitive sites; any feature or genomic annotation for which you have data available can be used. The CpG island data were generated using the method published by 
                    <xref ref-type="bibr" rid="ref-41">Wu 
                        <italic toggle="yes">et al.</italic> (2010)</xref>; the DNaseI hypersensitive site data were obtained from the 
                    <ext-link ext-link-type="uri" xlink:href="https://genome.ucsc.edu/cgi-bin/hgTables">UCSC Genome Browser</ext-link>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># CpG islands</styled-content>

                        <styled-content style="font-size:15px;">islandHMM &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">read.csv</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">paste</styled-content>
                        <styled-content style="font-size:15px;">(dataDirectory,</styled-content>
                               
                        <styled-content style="font-size:15px;color:#4F9905;">"model-based-cpg-islands-hg19-chr17.txt"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"/"</styled-content>
                        <styled-content style="font-size:15px;">),</styled-content> 
                        
                        <styled-content style="font-size:15px;color:#214A87;">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"\t"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">stringsAsFactors=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">header=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;">(islandHMM)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">##                V1      V2     V3    V4   V5   V6    V7    V8
## 1 chr17_ctg5_hap1    8935  10075  1141  129  815 0.714 0.887
## 2 chr17_ctg5_hap1   64252  64478   227   30  165 0.727 1.014
## 3 chr17_ctg5_hap1   87730  89480  1751  135 1194 0.682 0.663
## 4 chr17_ctg5_hap1   98265  98591   327   29  226 0.691 0.744
## 5 chr17_ctg5_hap1  120763 125451  4689  359 3032 0.647 0.733
## 6 chr17_ctg5_hap1  146257 146607   351   19  231 0.658 0.500</styled-content>
                    </preformat>
                </p>
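                <p>Before building ranges from this file, it can help to check how the coordinates are encoded. The sketch below re-enters columns 
                    <monospace>V2</monospace> to 
                    <monospace>V4</monospace> of the six rows printed above and tests whether the fourth column equals end minus start plus one. The file has no header, so the interpretation of the columns is an assumption; a passing check would suggest that the second and third columns are 1-based, inclusive start and end positions, which is the convention used by 
                    <monospace>IRanges</monospace>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># columns V2 to V4 of the rows shown above, typed in by hand
islandToy &lt;- data.frame(V2 = c(8935, 64252, 87730, 98265, 120763, 146257),
                        V3 = c(10075, 64478, 89480, 98591, 125451, 146607),
                        V4 = c(1141, 227, 1751, 327, 4689, 351))

# the width of a closed interval [start, end] is end - start + 1
all(islandToy$V3 - islandToy$V2 + 1 == islandToy$V4)</styled-content>
                    </preformat>
                </p>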
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">islandData &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">GRanges</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">seqnames=Rle</styled-content>
                        <styled-content style="font-size:15px;">(islandHMM[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">]),</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">ranges=IRanges</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">start=</styled-content>
                        <styled-content style="font-size:15px;">islandHMM[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">end=</styled-content>
                        <styled-content style="font-size:15px;">islandHMM[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">]),</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">strand=Rle</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">strand</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rep</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"*"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">nrow</styled-content>
                        <styled-content style="font-size:15px;">(islandHMM)))))
islandData</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## GRanges object with 3456 ranges and 0 metadata columns:
##                    seqnames               ranges strand
##                       &lt;Rle&gt;            &lt;IRanges&gt;  &lt;Rle&gt;
##      [1]    chr17_ctg5_hap1     [   8935, 10075]      *
##      [2]    chr17_ctg5_hap1     [  64252, 64478]      *
##      [3]    chr17_ctg5_hap1     [  87730, 89480]      *
##      [4]    chr17_ctg5_hap1     [  98265, 98591]      *
##      [5]    chr17_ctg5_hap1     [120763, 125451]      *
##      ...      ...                   ...   ...
##   [3452]              chr17 [81147380, 81147511]      *
##   [3453]              chr17 [81147844, 81148321]      *
##   [3454]              chr17 [81152612, 81153665]      *
##   [3455]              chr17 [81156194, 81156512]      *
##   [3456]              chr17 [81162945, 81165532]      *
##   -------
##   seqinfo: 5 sequences from an unspecified genome; no seqlengths</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># DNaseI hypersensitive sites</styled-content>

                        <styled-content style="font-size:15px;">dnase &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">read.csv</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">paste</styled-content>
                        <styled-content style="font-size:15px;">(dataDirectory,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"wgEncodeRegDnaseClusteredV3chr17.bed"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                           
                        <styled-content style="font-size:15px;color:#214A87;">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"/"</styled-content>
                        <styled-content style="font-size:15px;">),</styled-content>
                    
                        <styled-content style="font-size:15px;color:#214A87;">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"\t"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">stringsAsFactors=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">header=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">FALSE</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;">(dnase)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">##      V1   V2   V3 V4  V5 V6                                       V7
## 1 chr17  125  335  7 444  7                    84,83,88,90,77,87,89,
## 2 chr17  685  835  1 150  1                                      80,
## 3 chr17 2440 2675 13 410 13 0,30,102,104,38,47,61,68,122,1,51,73,75,
## 4 chr17 3020 3170  1 247  1                                     120,
## 5 chr17 3740 3890  2 161  2                                   71,73,
## 6 chr17 5520 6110  4 241  5                          17,19,25,16,16,
##                                               V8
## 1                   328,208,444,218,109,171,191,
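## 2                                           150,
## 3 204,410,301,206,46,48,84,164,85,12,98,215,146,
## 4                                           247,
## 5                                       108,161,
## 6                             241,185,239,26,52,</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">dnaseData &lt;-</styled-content> 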
                        <styled-content style="font-size:15px;color:#214A87;">GRanges</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">seqnames=</styled-content>
                        <styled-content style="font-size:15px;">dnase[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">],</styled-content>
		        
                        <styled-content style="font-size:15px;color:#214A87;">ranges=IRanges</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">start=</styled-content>
                        <styled-content style="font-size:15px;">dnase[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">end=</styled-content>
                        <styled-content style="font-size:15px;">dnase[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">]),</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">strand=Rle</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rep</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"*"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">nrow</styled-content>
                        <styled-content style="font-size:15px;">(dnase))),</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">data=</styled-content>
                        <styled-content style="font-size:15px;">dnase[,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">5</styled-content>
                        <styled-content style="font-size:15px;">])</styled-content>

                        <styled-content style="font-size:15px;">dnaseData</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## GRanges object with 74282 ranges and 1 metadata column:
##           seqnames               ranges strand |      data
##              &lt;Rle&gt;            &lt;IRanges&gt;  &lt;Rle&gt; | &lt;integer&gt;
##       [1]    chr17         [  125, 335]      * |       444
##       [2]    chr17         [  685, 835]      * |       150
##       [3]    chr17         [2440, 2675]      * |       410
##       [4]    chr17         [3020, 3170]      * |       247
##       [5]    chr17         [3740, 3890]      * |       161
##       ...      ...                  ...    ... .       ...
##   [74278]    chr17 [81153140, 81153350]      * |       574
##   [74279]    chr17 [81153580, 81153810]      * |       208
##   [74280]    chr17 [81185540, 81185750]      * |       326
##   [74281]    chr17 [81188880, 81189090]      * |       209
##   [74282]    chr17 [81194900, 81195115]      * |       185
##   -------
##   seqinfo: 1 sequence from an unspecified genome; no seqlengths</styled-content>
                    </preformat>
                </p>
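                <p>One detail worth keeping in mind is that files in UCSC BED format store 0-based, half-open intervals, whereas 
                    <monospace>IRanges</monospace> uses 1-based, closed intervals. Assuming the DNaseI file follows the BED convention (its extension suggests so, but this is not verified here), a strict conversion would add one to each start and leave the end unchanged. At the scale of the plots in this section a one-base shift is unlikely to be visible, so the sketch below, which uses the first three rows printed above and illustrative column names, is only meant to show the arithmetic.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># first three rows of the DNaseI table shown above, typed in by hand
bedToy &lt;- data.frame(chrom = "chr17",
                     chromStart = c(125, 685, 2440),
                     chromEnd = c(335, 835, 2675))

# BED: 0-based start, exclusive end; 1-based closed: start + 1, same end
oneBased &lt;- data.frame(chrom = bedToy$chrom,
                       start = bedToy$chromStart + 1,
                       end = bedToy$chromEnd)

# the interval width is unchanged by the conversion
all(oneBased$end - oneBased$start + 1 == bedToy$chromEnd - bedToy$chromStart)
oneBased</styled-content>
                    </preformat>
                </p>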
                <p>Now, set up the ideogram, genome and RefSeq tracks that will provide context for our methylation data.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">iTrack &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">IdeogramTrack</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">genome =</styled-content> 
                        <styled-content style="font-size:15px;">gen,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">chromosome =</styled-content> 
                        <styled-content style="font-size:15px;">chrom,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">name=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">""</styled-content>
                        <styled-content style="font-size:15px;">)
gTrack &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">GenomeAxisTrack</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"black"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">cex=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">name=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">""</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">fontcolor=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"black"</styled-content>
                        <styled-content style="font-size:15px;">)
rTrack &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">UcscTrack</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">genome=</styled-content>
                        <styled-content style="font-size:15px;">gen,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">chromosome=</styled-content>
                        <styled-content style="font-size:15px;">chrom,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">track=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"refGene"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">from=</styled-content>
                        <styled-content style="font-size:15px;">minbase,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">to=</styled-content>
                        <styled-content style="font-size:15px;">maxbase,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">trackType=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"GeneRegionTrack"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">rstarts=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"exonStarts"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">rends=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"exonEnds"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gene=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"name"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">symbol=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"name2"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">transcript=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"name"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">strand=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"strand"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">fill=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"darkblue"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">stacking=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"squish"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">name=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"RefSeq"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        
                        <styled-content style="font-size:15px;color:#214A87;">showId=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903">TRUE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">geneSymbol=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903">TRUE</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
</preformat>
                </p>
                <p>Ensure that the methylation data is ordered by chromosome and base position.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">ann450kOrd &lt;- ann450kSub[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">order</styled-content>
                        <styled-content style="font-size:15px;">(ann450kSub$chr,ann450kSub$pos),]</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;">(ann450kOrd)


## DataFrame with 6 rows and 22 columns
##                    chr       pos      strand        Name    Probe_rs
##            &lt;character&gt; &lt;integer&gt; &lt;character&gt; &lt;character&gt; &lt;character&gt;
## cg13869341        chr1     15865           +  cg13869341          NA
## cg24669183        chr1    534242           -  cg24669183   rs6680725
## cg15560884        chr1    710097           +  cg15560884          NA
## cg01014490        chr1    714177           -  cg01014490          NA
## cg17505339        chr1    720865           -  cg17505339          NA
## cg11954957        chr1    758829           +  cg11954957 rs115498424
##            Probe_maf      CpG_rs   CpG_maf      SBE_rs   SBE_maf
##            &lt;numeric&gt; &lt;character&gt; &lt;numeric&gt; &lt;character&gt; &lt;numeric&gt;
## cg13869341        NA          NA        NA          NA        NA
## cg24669183  0.108100          NA        NA          NA        NA
## cg15560884        NA          NA        NA          NA        NA
## cg01014490        NA          NA        NA          NA        NA
## cg17505339        NA          NA        NA          NA        NA
## cg11954957  0.029514          NA        NA          NA        NA
##                  Islands_Name Relation_to_Island UCSC_RefGene_Name
##                   &lt;character&gt;        &lt;character&gt;       &lt;character&gt;
## cg13869341                               OpenSea            WASH5P
## cg24669183 chr1:533219-534114            S_Shore
## cg15560884 chr1:713984-714547            N_Shelf
## cg01014490 chr1:713984-714547             Island
## cg17505339                               OpenSea
## cg11954957 chr1:762416-763445            N_Shelf
##            UCSC_RefGene_Accession UCSC_RefGene_Group     Phantom
##                       &lt;character&gt;        &lt;character&gt; &lt;character&gt;
## cg13869341              NR_024540               Body
## cg24669183
## cg15560884
## cg01014490
## cg17505339
## cg11954957
##                    DMR    Enhancer      HMM_Island Regulatory_Feature_Name
##            &lt;character&gt; &lt;character&gt;     &lt;character&gt;             &lt;character&gt;
## cg13869341
## cg24669183                         1:523025-524193
## cg15560884
## cg01014490                         1:703784-704410         1:713802-715219
## cg17505339
## cg11954957
##            Regulatory_Feature_Group         DHS
##                         &lt;character&gt; &lt;character&gt;
## cg13869341
## cg24669183
## cg15560884
## cg01014490      Promoter_Associated
## cg17505339
## cg11954957


bValsOrd &lt;- bVals[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">match</styled-content>
                        <styled-content style="font-size:15px;">(ann450kOrd$Name,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rownames</styled-content>
                        <styled-content style="font-size:15px;">(bVals)),]</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">head</styled-content>
                        <styled-content style="font-size:15px;">(bValsOrd)


##               naive.1    rTreg.2 act_naive.3    naive.4 act_naive.5
## cg13869341 0.84267937 0.85118462   0.8177504 0.82987650  0.81186174
## cg24669183 0.81812908 0.82489238   0.8293297 0.75610281  0.81967323
## cg15560884 0.77219626 0.74903910   0.7516263 0.77417882  0.77266205
## cg01014490 0.08098986 0.06590459   0.0233755 0.04127262  0.04842397
## cg17505339 0.89439216 0.93822870   0.9471357 0.90520570  0.92641305
## cg11954957 0.74495496 0.79008516   0.7681146 0.84450764  0.75431167
##            act_rTreg.6   naive.7    rTreg.8 act_naive.9 act_rTreg.10
## cg13869341   0.8090798 0.8891851 0.88537940  0.90916748   0.88334231
## cg24669183   0.8187838 0.7903763 0.85304116  0.80930568   0.80979554
## cg15560884   0.7721528 0.7658623 0.75909061  0.78099397   0.78569274
## cg01014490   0.0644404 0.0245281 0.02832358  0.07740468   0.04640659
## cg17505339   0.9286016 0.8889361 0.87205348  0.90099782   0.93508348
## cg11954957   0.8116911 0.7832207 0.84929777  0.84719430   0.83350220
</styled-content>
</preformat>
                </p>
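                <p>The ordering step can be checked on a toy example. The sketch below uses only base R; the probe names, positions and beta values are made up for illustration and are not part of the workflow data. It sorts an annotation table by chromosome and position, then uses 
                    <monospace>match</monospace> to put the rows of a beta matrix in the same order.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# Toy annotation with hypothetical probe IDs, deliberately out of genomic order
ann &lt;- data.frame(Name = c("cgC", "cgA", "cgB"),
                  chr  = c("chr2", "chr1", "chr1"),
                  pos  = c(500L, 900L, 100L),
                  stringsAsFactors = FALSE)

# Toy beta matrix whose rows are in yet another order
beta &lt;- matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), nrow = 3,
               dimnames = list(c("cgA", "cgB", "cgC"), c("s1", "s2")))

# Sort the annotation by chromosome, then by position within chromosome
annOrd &lt;- ann[order(ann$chr, ann$pos), ]

# Reorder the beta rows so that row i matches annotation row i
betaOrd &lt;- beta[match(annOrd$Name, rownames(beta)), ]

# Both objects should now list the probes in the same order
stopifnot(identical(annOrd$Name, rownames(betaOrd)))
annOrd$Name
</preformat>
                </p>
                <p>Because the chromosome column is a character vector, 
                    <monospace>order</monospace> sorts it as text, so a name such as chr10 would typically sort before chr2. This does not matter when the data come from a single chromosome.</p>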
                <p>Create the data tracks using the appropriate track type for each data type.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;color:#8F5903"># create genomic ranges object from methylation data</styled-content>

                        <styled-content style="font-size:15px;">cpgData &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">GRanges</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">seqnames=Rle</styled-content>
                        <styled-content style="font-size:15px;">(ann450kOrd$chr),</styled-content>
                     
                        <styled-content style="font-size:15px;color:#214A87;">ranges=IRanges</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">start=</styled-content>
                        <styled-content style="font-size:15px;">ann450kOrd$pos,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">end=</styled-content>
                        <styled-content style="font-size:15px;">ann450kOrd$pos),</styled-content>
                     
                        <styled-content style="font-size:15px;color:#214A87;">strand=Rle</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">rep</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"*"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">nrow</styled-content>
                        <styled-content style="font-size:15px;">(ann450kOrd))),</styled-content>
                     
                        <styled-content style="font-size:15px;color:#214A87;">betas=</styled-content>
                        <styled-content style="font-size:15px;">bValsOrd)</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903"># extract data on CpGs in DMR</styled-content>

                        <styled-content style="font-size:15px;">cpgData &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">subsetByOverlaps</styled-content>
                        <styled-content style="font-size:15px;">(cpgData, results.ranges[dmrIndex])</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># methylation data track</styled-content>

                        <styled-content style="font-size:15px;">methTrack &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">DataTrack</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">range=</styled-content>
                        <styled-content style="font-size:15px;">cpgData,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">groups=</styled-content>
                        <styled-content style="font-size:15px;">targets$Sample_Group,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">genome =</styled-content> 
                        <styled-content style="font-size:15px;">gen,</styled-content>
                          
                        <styled-content style="font-size:15px;color:#214A87;">chromosome=</styled-content>
                        <styled-content style="font-size:15px;">chrom,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">ylim=c</styled-content>
                        <styled-content style="font-size:15px;">(-</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.05</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1.05</styled-content>
                        <styled-content style="font-size:15px;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">pal,</styled-content>
                          
                        <styled-content style="font-size:15px;color:#214A87;">type=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"a"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"p"</styled-content>
                        <styled-content style="font-size:15px;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">name=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DNA Meth.\n(beta value)"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                          
                        <styled-content style="font-size:15px;color:#214A87;">background.panel=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"white"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903">TRUE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">cex.title=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.8</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                          
                        <styled-content style="font-size:15px;color:#214A87;">cex.axis=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.8</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">cex.legend=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.8</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903"># CpG island track</styled-content>

                        <styled-content style="font-size:15px;">islandTrack &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">AnnotationTrack</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">range=</styled-content>
                        <styled-content style="font-size:15px;">islandData,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">genome=</styled-content>
                        <styled-content style="font-size:15px;">gen,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">name=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"CpG Is."</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                                   
                        <styled-content style="font-size:15px;color:#214A87;">chromosome=</styled-content>
                        <styled-content style="font-size:15px;">chrom,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">fill=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"darkgreen"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># DNaseI hypersensitive site data track</styled-content>

                        <styled-content style="font-size:15px;">dnaseTrack &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">DataTrack</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">range=</styled-content>
                        <styled-content style="font-size:15px;">dnaseData,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">genome=</styled-content>
                        <styled-content style="font-size:15px;">gen,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">name=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DNAseI"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                           
                        <styled-content style="font-size:15px;color:#214A87;">type=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"gradient"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">chromosome=</styled-content>
                        <styled-content style="font-size:15px;">chrom)</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># DMR position data track</styled-content>

                        <styled-content style="font-size:15px;">dmrTrack &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">AnnotationTrack</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">start=</styled-content>
                        <styled-content style="font-size:15px;">start,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">end=</styled-content>
                        <styled-content style="font-size:15px;">end,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">genome=</styled-content>
                        <styled-content style="font-size:15px;">gen,</styled-content>  
                        <styled-content style="font-size:15px;color:#214A87;">name=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"DMR"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                               
                        <styled-content style="font-size:15px;color:#214A87;">chromosome=</styled-content>
                        <styled-content style="font-size:15px;">chrom,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">fill=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"darkred"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
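                <p>As a minimal sketch of how the methylation data are restricted to one region, the example below builds a small 
                    <monospace>GRanges</monospace> object and subsets it with 
                    <monospace>subsetByOverlaps</monospace>. The positions, the sample columns 
                    <monospace>s1</monospace> and 
                    <monospace>s2</monospace>, and the region boundaries are hypothetical; it assumes the 
                    <italic toggle="yes">GenomicRanges</italic> package is installed.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
library(GenomicRanges)

# Hypothetical single-base CpG positions with beta values for two samples,
# stored as metadata columns of the GRanges object
cpgs &lt;- GRanges(seqnames = Rle("chr1", 4),
                ranges = IRanges(start = c(100, 250, 400, 900),
                                 end   = c(100, 250, 400, 900)),
                s1 = c(0.10, 0.80, 0.75, 0.20),
                s2 = c(0.15, 0.85, 0.70, 0.25))

# Hypothetical region standing in for a single DMR
region &lt;- GRanges(seqnames = "chr1",
                  ranges = IRanges(start = 200, end = 500))

# Keep only the CpGs that overlap the region; metadata columns are retained
inRegion &lt;- subsetByOverlaps(cpgs, region)
start(inRegion)
</preformat>
                </p>
                <p>Only the CpGs at positions 250 and 400 fall inside the region, so the other two are dropped along with their beta values.</p>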
                <p>Set up the track list and indicate the relative sizes of the different tracks. Finally, draw the plot using the 
                    <monospace>plotTracks</monospace> function (
                    <xref ref-type="fig" rid="f11">Figure 11</xref>).</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">tracks &lt;-</styled-content>  
                        <styled-content style="font-size:15px;color:#214A87;">list</styled-content>
                        <styled-content style="font-size:15px;">(iTrack, gTrack, methTrack, dmrTrack, islandTrack, dnaseTrack,
               rTrack)
sizes &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">5</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>  
                        <styled-content style="font-size:15px;color:#8F5903"># set up the relative sizes of the tracks</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">plotTracks</styled-content>
                        <styled-content style="font-size:15px;">(tracks,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">from=</styled-content>
                        <styled-content style="font-size:15px;">minbase,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">to=</styled-content>
                        <styled-content style="font-size:15px;">maxbase,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">showTitle=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903">TRUE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">add53=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903">TRUE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
            
                        <styled-content style="font-size:15px;color:#214A87;">add35=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903">TRUE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">grid=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903">TRUE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">lty.grid=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">sizes=</styled-content>
                        <styled-content style="font-size:15px;">sizes)</styled-content>
                    </preformat>
                </p>
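                <p>The 
                    <monospace>sizes</monospace> vector is expected to contain one relative height per track, so it can be worth checking its length before plotting. The sketch below is base R only and uses character placeholders in place of the actual track objects.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# Character placeholders stand in for the Gviz track objects (illustration only)
tracks &lt;- list("iTrack", "gTrack", "methTrack", "dmrTrack",
               "islandTrack", "dnaseTrack", "rTrack")
sizes &lt;- c(2, 2, 5, 2, 2, 2, 3)

# One relative size per track is expected
stopifnot(length(sizes) == length(tracks))

# Approximate share of the plotting height given to each track
round(sizes / sum(sizes), 2)
</preformat>
                </p>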
            </sec>
        </sec>
        <sec>
            <title>Additional analyses</title>
            <sec>
                <title>Gene ontology testing</title>
                <p>Once you have performed a differential methylation analysis, there may be a very long list of significant CpG sites to interpret. One question a researcher may have is, &#x201c;which gene pathways are over-represented for differentially methylated CpGs?&#x201d; In some cases it is relatively straightforward to link the top differentially methylated CpGs to genes that make biological sense in terms of the cell types or samples being studied, but this becomes difficult when many thousands of CpGs are significantly differentially methylated. To gain an understanding of the biological processes that the differentially methylated CpGs may be involved in, we can perform gene ontology or KEGG pathway analysis using the 
                    <monospace>gometh</monospace> function in the 
                    <italic toggle="yes">missMethyl</italic> package (
                    <xref ref-type="bibr" rid="ref-28">Phipson 
                        <italic toggle="yes">et al.</italic>, 2016</xref>).</p>
                <p>Let us consider the first comparison, naive vs rTreg, with the results of the analysis in the 
                    <monospace>DMPs</monospace> table. The 
                    <monospace>gometh</monospace> function takes as input a character vector of the names (e.g. cg20832020) of the significant CpG sites and, optionally, a character vector of all CpGs tested. Supplying this second vector is recommended, particularly if extensive filtering of the CpGs has been performed prior to analysis. For gene ontology testing (the default), the user can specify 
                    <monospace>collection="GO"</monospace>. For testing KEGG pathways, specify 
                    <monospace>collection="KEGG"</monospace>. In the 
                    <monospace>DMPs</monospace> table, the 
                    <monospace>Name</monospace> column corresponds to the CpG name. We will select all CpG sites that have an adjusted p-value of less than 0.05.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903"># Get the significant CpG sites at less than 5% FDR</styled-content>

                        <styled-content style="font-size:15px;">sigCpGs &lt;- DMPs$Name[DMPs$adj.P.Val&lt;</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0.05</styled-content>
                        <styled-content style="font-size:15px;">]</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903"># First 10 significant CpGs</styled-content>

                        <styled-content style="font-size:15px;">sigCpGs[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">10</styled-content>
                        <styled-content style="font-size:15px;">]


##  [1] "cg07499259" "cg26992245" "cg09747445" "cg18808929" "cg25015733"
##  [6] "cg21179654" "cg26280976" "cg16943019" "cg10898310" "cg25130381"
</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># Total number of significant CpGs at 5% FDR</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">length</styled-content>
                        <styled-content style="font-size:15px;">(sigCpGs)</styled-content>


                        <styled-content style="font-size:15px;">## [1] 3021</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># Get all the CpG sites used in the analysis to form the background</styled-content>

                        <styled-content style="font-size:15px;">all &lt;- DMPs$Name</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903"># Total number of CpG sites tested</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">length</styled-content>
                        <styled-content style="font-size:15px;">(all)


## [1] 439918</styled-content>
</preformat>
                </p>
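                <p>The selection logic can be tried on a small table first. The following base R sketch uses a hypothetical results table with made-up probe names and p-values; only the 
                    <monospace>Name</monospace> and 
                    <monospace>adj.P.Val</monospace> column names are taken from the code above.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
# Hypothetical results table; values are invented for illustration
res &lt;- data.frame(Name = c("cg01", "cg02", "cg03", "cg04"),
                  adj.P.Val = c(0.001, 0.20, 0.049, NA),
                  stringsAsFactors = FALSE)

# which() drops rows whose adjusted p-value is missing, whereas plain
# logical indexing would return NA for them
sig &lt;- res$Name[which(res$adj.P.Val &lt; 0.05)]

# Every probe that was tested forms the background set
background &lt;- res$Name

sig
length(background)
</preformat>
                </p>
                <p>Wrapping the comparison in 
                    <monospace>which</monospace> is a defensive choice for tables that may contain missing p-values, and a name such as 
                    <monospace>background</monospace> avoids reusing the name of the base R function 
                    <monospace>all</monospace>.</p>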
                <p>The 
                    <monospace>gometh</monospace> function takes into account the varying numbers of CpGs associated with each gene on the Illumina methylation arrays. For the 450k array, the numbers of CpGs mapping to genes can vary from as few as 1 to as many as 1200. The genes that have more CpGs associated with them will have a higher probability of being identified as differentially methylated compared to genes with fewer CpGs. We can look at this bias in the data by specifying 
                    <monospace>plot.bias=TRUE</monospace> in the call to 
                    <monospace>gometh</monospace> (
                    <xref ref-type="fig" rid="f12">Figure 12</xref>).</p>
                <fig fig-type="figure" id="f12" orientation="portrait" position="float">
                    <label>Figure 12. </label>
                    <caption>
                        <title>Bias resulting from different numbers of CpG probes in different genes.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure12.gif"/>
                </fig>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">))
gst &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">gometh</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">sig.cpg=</styled-content>
                        <styled-content style="font-size:15px;">sigCpGs,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">all.cpg=</styled-content>
                        <styled-content style="font-size:15px;">all,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">plot.bias=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903">TRUE</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>



                        <styled-content style="font-size:15px;">## Warning in alias2SymbolTable(flat$symbol): Multiple symbols ignored for one
## or more aliases</styled-content>
                    </preformat>
                </p>
                <p>The 
                    <monospace>gst</monospace> object is a 
                    <monospace>data.frame</monospace> with each row corresponding to a GO category being tested. Note that the warning regarding multiple symbols will always be displayed because some genes have more than one alias; it is not a cause for concern.</p>
                <p>By default, the 
                    <monospace>topGO</monospace> function displays the top 20 gene ontology categories; the 
                    <monospace>number</monospace> argument changes how many are shown. For KEGG pathway analysis, the 
                    <monospace>topKEGG</monospace> function can be called to display the top 20 enriched pathways.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">

                        <styled-content style="font-size:15px;color:#8F5903"># Top 10 GO categories</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">topGO</styled-content>
                        <styled-content style="font-size:15px;">(gst,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">number=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">10</styled-content>
                        <styled-content style="font-size:15px;">)


##                                           Term Ont    N  DE
## GO:0002376               immune system process  BP 2240 340
## GO:0006955                     immune response  BP 1409 212
## GO:0001775                     cell activation  BP  837 158
## GO:0007159        leukocyte cell-cell adhesion  BP  455 103
## GO:0046649               lymphocyte activation  BP  574 119
## GO:0002682 regulation of immune system process  BP 1225 195
## GO:0045321                leukocyte activation  BP  676 130
## GO:0070486               leukocyte aggregation  BP  423  96
## GO:0042110                   T cell activation  BP  415  94
## GO:0070489                  T cell aggregation  BP  415  94
##                                              P.DE
## GO:0002376 0.000000000000000000000000000003229702
## GO:0006955 0.000000000000000000000422272703517178
## GO:0001775 0.000000000000000000010295538258512461
## GO:0007159 0.000000000000000000090040070213398411
## GO:0046649 0.000000000000000000250620553154991038
## GO:0002682 0.000000000000000000263741544330346864
## GO:0045321 0.000000000000000001995676602987099282
## GO:0070486 0.000000000000000002407114373683902864
## GO:0042110 0.000000000000000004264084670812330066
## GO:0070489 0.000000000000000004264084670812330066
##                                           FDR
## GO:0002376 0.00000000000000000000000006820484
## GO:0006955 0.00000000000000000445877747643788
## GO:0001775 0.00000000000000007247372564775539
## GO:0007159 0.00000000000000047536655069163696
## GO:0046649 0.00000000000000092828232219471078
## GO:0002682 0.00000000000000092828232219471078
## GO:0045321 0.00000000000000602067121455450880
## GO:0070486 0.00000000000000635418016793208260
## GO:0042110 0.00000000000000900489400782147891
## GO:0070489 0.00000000000000900489400782147891</styled-content>
                    </preformat>
                </p>
                <p>From the output we can see many of the top GO categories correspond to immune system and T cell processes, which is unsurprising as the cell types being studied form part of the immune system. Typically, we consider GO categories that have associated false discovery rates of less than 5% to be statistically significant. If there aren&#x2019;t any categories that achieve this significance it may be useful to scan the top 5 or 10 highly ranked GO categories to gain some insight into the biological system.</p>
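                <p>The 5% threshold can be applied directly to the 
                    <monospace>FDR</monospace> column. The following is a minimal sketch on a toy table with hypothetical P-values; it assumes that the 
                    <monospace>FDR</monospace> column is a Benjamini-Hochberg adjustment of 
                    <monospace>P.DE</monospace>, which is our assumption and should be checked against the documentation for the package version in use.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># toy results table with hypothetical P-values (not real gometh output)
toy &lt;- data.frame(Term = paste("category", 1:8),
                  P.DE = c(1e-6, 0.0004, 0.003, 0.02, 0.04, 0.2, 0.5, 0.9))

# Benjamini-Hochberg false discovery rate (assumed adjustment method)
toy$FDR &lt;- p.adjust(toy$P.DE, method = "BH")

# categories significant at a 5% false discovery rate
sig &lt;- toy[toy$FDR &lt; 0.05, ]
sig</styled-content>
                    </preformat>
                </p>
                <p>Note that the fifth category has a raw P-value below 0.05 but does not pass the FDR cut-off. With the real results, the equivalent filter would be 
                    <monospace>gst[gst$FDR &lt; 0.05, ]</monospace>.</p>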
                <p>The 
                    <monospace>gometh</monospace> function only tests GO and KEGG pathways. For a more generalised version of gene set testing for methylation data where the user can specify the gene set to be tested, the 
                    <monospace>gsameth</monospace> function can be used. To display the top 20 pathways, 
                    <monospace>topGSA</monospace> can be called. 
                    <monospace>gsameth</monospace> accepts a single gene set, or a list of gene sets. The gene identifiers in the gene set must be Entrez Gene IDs. To demonstrate 
                    <monospace>gsameth</monospace>, we use the curated gene sets (C2) from the Broad Institute Molecular Signatures 
                    <ext-link ext-link-type="uri" xlink:href="http://software.broadinstitute.org/gsea/msigdb">database</ext-link>. These can be downloaded as an 
                    <monospace>RData</monospace> object from the WEHI Bioinformatics 
                    <ext-link ext-link-type="uri" xlink:href="http://bioinf.wehi.edu.au/software/MSigDB/">website</ext-link>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903"># load Broad human curated (C2) gene sets</styled-content>

                        <styled-content style="font-size:15px;color:#214A87">load</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">paste</styled-content>
                        <styled-content style="font-size:15px;">(dataDirectory,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"human_c2_v5.rdata"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"/"</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903"># perform the gene set test(s)</styled-content>

                        <styled-content style="font-size:15px;">gsa &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">gsameth</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">sig.cpg=</styled-content>
                        <styled-content style="font-size:15px;">sigCpGs,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">all.cpg=</styled-content>
                        <styled-content style="font-size:15px;">all,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">collection=</styled-content>
                        <styled-content style="font-size:15px;">Hs.c2)</styled-content>
</preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## Warning in alias2SymbolTable(flat$symbol): Multiple symbols ignored for one
## or more aliases</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903"># top 10 gene sets</styled-content>

                        <styled-content style="font-size:15px;color:#214A87">topGSA</styled-content>
                        <styled-content style="font-size:15px;">(gsa,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">number=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">10</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">##                                         N  DE                       P.DE
## REACTOME_IMMUNE_SYSTEM                933 127 0.000000000000000000000000
## DACOSTA_UV_RESPONSE_VIA_ERCC3_DN      855 147 0.000000000000000000000000
## ZHENG_BOUND_BY_FOXP3                  491 138 0.000000000000000000000000
## MARSON_BOUND_BY_FOXP3_UNSTIMULATED   1229 169 0.000000000000000000000000
## CHEN_METABOLIC_SYNDROM_NETWORK       1210 162 0.000000000000000000000000
## MARTENS_BOUND_BY_PML_RARA_FUSION      456 105 0.000000000000000000000000
## PILON_KLF1_TARGETS_DN                1972 262 0.000000000000000000000000
## JAATINEN_HEMATOPOIETIC_STEM_CELL_DN   226  59 0.000000000000000004151028
## SMID_BREAST_CANCER_NORMAL_LIKE_UP     476  92 0.000000000000002674784772
## LEE_EARLY_T_LYMPHOCYTE_DN              57  25 0.000000000000825660086765
##                                                          FDR
## REACTOME_IMMUNE_SYSTEM               0.000000000000000000000
## DACOSTA_UV_RESPONSE_VIA_ERCC3_DN     0.000000000000000000000
## ZHENG_BOUND_BY_FOXP3                 0.000000000000000000000
## MARSON_BOUND_BY_FOXP3_UNSTIMULATED   0.000000000000000000000
## CHEN_METABOLIC_SYNDROM_NETWORK       0.000000000000000000000
## MARTENS_BOUND_BY_PML_RARA_FUSION     0.000000000000000000000
## PILON_KLF1_TARGETS_DN                0.000000000000000000000
## JAATINEN_HEMATOPOIETIC_STEM_CELL_DN  0.000000000000002451701
## SMID_BREAST_CANCER_NORMAL_LIKE_UP    0.000000000001404262005
## LEE_EARLY_T_LYMPHOCYTE_DN            0.000000000390124390997</styled-content>
                    </preformat>
                </p>
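                <p>User-defined gene sets can be supplied in the same way as the C2 collection. The sketch below builds a small list in the format described above: a named list of character vectors of Entrez Gene IDs. The set names are hypothetical and the sets are for illustration only; the IDs correspond to TP53, BRCA1 and BRCA2 in the first set, and CD4, CD8A, FOXP3 and IL2RA in the second.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># hypothetical gene sets, identified by Entrez Gene IDs stored as character
my.sets &lt;- list(
  HYPOTHETICAL_TUMOUR_SUPPRESSORS = c("7157", "672", "675"),
  HYPOTHETICAL_T_CELL_MARKERS     = c("920", "925", "50943", "3559")
)

# basic sanity checks on the format before testing
stopifnot(is.list(my.sets),
          all(sapply(my.sets, is.character)),
          all(grepl("^[0-9]+$", unlist(my.sets))))

# number of genes in each set
sapply(my.sets, length)

# with sigCpGs and all from this workflow in the session, the call would be:
# gsa.custom &lt;- gsameth(sig.cpg=sigCpGs, all.cpg=all, collection=my.sets)
# topGSA(gsa.custom)</styled-content>
                    </preformat>
                </p>
                <p>Sets this small are unlikely to give meaningful results; they only show the expected structure of the 
                    <monospace>collection</monospace> argument.</p>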
                <p>While gene set testing is useful for providing some biological insight into which pathways might be affected by aberrant methylation, care should be taken not to over-interpret the results. Ideally, its findings would be tested and validated in further laboratory experiments. It is important to keep in mind that we are not observing gene-level activity, as we would in RNA-Seq experiments, and that we have had to take an extra step to associate CpGs with genes.</p>
            </sec>
            <sec id="DV">
                <title>Differential variability</title>
                <p>Rather than testing for differences in mean methylation, we may be interested in testing for differences between group variances. For example, it has been hypothesised that highly variable CpGs in cancer may contribute to tumour heterogeneity (
                    <xref ref-type="bibr" rid="ref-13">Hansen 
                        <italic toggle="yes">et al.</italic>, 2011</xref>). Hence we may be interested in CpG sites that are consistently methylated in one group, but variably methylated in another group.</p>
                <p>Sample size is an important consideration when testing for differentially variable CpG sites. In order to get an accurate estimate of the group variances, larger sample sizes are required than for estimating group means. A good rule of thumb is to have at least ten samples in each group (
                    <xref ref-type="bibr" rid="ref-29">Phipson &amp; Oshlack, 2014</xref>). To demonstrate testing for differentially variable CpG sites, we will use a publicly available ageing dataset (GSE30870), in which whole blood samples were collected from 18 centenarians and 18 newborns and profiled for methylation on the 450k array (
                    <xref ref-type="bibr" rid="ref-14">Heyn 
                        <italic toggle="yes">et al.</italic>, 2012</xref>). The data (
                    <monospace>age.rgSet</monospace>) and sample information (
                    <monospace>age.targets</monospace>) have been included as an R data object in the data archive you previously downloaded from 
                    <ext-link ext-link-type="uri" xlink:href="https://figshare.com/articles/methylAnalysisDataV3_tar_gz/4800970">figshare</ext-link>. We can load the data using the 
                    <monospace>load</monospace> command, after which it needs to be normalised and filtered as previously described.</p>
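                <p>As an aside on the sample size guidance above, the following small simulation (a sketch on standard normal data, with an arbitrary seed and number of simulations) compares how much the sample mean and the sample variance fluctuate from one simulated group to the next at several group sizes.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">set.seed(2014)

# spread (standard deviation) of the sample mean and sample variance
# across n.sim simulated groups of size n drawn from N(0, 1)
est.spread &lt;- function(n, n.sim = 2000) {
  sims &lt;- replicate(n.sim, {
    x &lt;- rnorm(n)
    c(mean = mean(x), var = var(x))
  })
  apply(sims, 1, sd)
}

sizes &lt;- c(5, 10, 20, 50)
spread &lt;- t(sapply(sizes, est.spread))
rownames(spread) &lt;- paste0("n=", sizes)
round(spread, 2)</styled-content>
                    </preformat>
                </p>
                <p>In this setting the variance estimate is noticeably less stable than the mean estimate at every group size, which is consistent with the recommendation to use larger groups when testing for differential variability.</p>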
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903"># load data</styled-content>

                        <styled-content style="font-size:15px;color:#214A87">load</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">paste</styled-content>
                        <styled-content style="font-size:15px;">(dataDirectory,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"ageData.RData"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">sep=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"/"</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># calculate detection p-values</styled-content>

                        <styled-content style="font-size:15px;">age.detP &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">detectionP</styled-content>
                        <styled-content style="font-size:15px;">(age.rgSet)</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># pre-process the data (no samples are excluded in this example)</styled-content>

                        <styled-content style="font-size:15px;">age.mSetSq &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">preprocessQuantile</styled-content>
                        <styled-content style="font-size:15px;">(age.rgSet)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## [preprocessQuantile] Mapping to genome.</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## [preprocessQuantile] Fixing outliers.</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## [preprocessQuantile] Quantile normalizing.</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903"># add sex information to targets information</styled-content>

                        <styled-content style="font-size:15px;">age.targets$Sex &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">getSex</styled-content>
                        <styled-content style="font-size:15px;">(age.mSetSq)$predictedSex</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># ensure probes are in the same order in the mSetSq and detP objects</styled-content>

                        <styled-content style="font-size:15px;">age.detP &lt;- age.detP[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">match</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">featureNames</styled-content>
                        <styled-content style="font-size:15px;">(age.mSetSq),</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">rownames</styled-content>
                        <styled-content style="font-size:15px;">(age.detP)),]</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903"># remove poor quality probes</styled-content>

                        <styled-content style="font-size:15px;">keep &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">rowSums</styled-content>
                        <styled-content style="font-size:15px;">(age.detP &lt;</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF">0.01</styled-content>
                        <styled-content style="font-size:15px;">) ==</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">ncol</styled-content>
                        <styled-content style="font-size:15px;">(age.detP)
age.mSetSqFlt &lt;- age.mSetSq[keep,]</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># remove probes with SNPs at CpG or single base extension (SBE) site</styled-content>

                        <styled-content style="font-size:15px;">age.mSetSqFlt &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">dropLociWithSnps</styled-content>
                        <styled-content style="font-size:15px;">(age.mSetSqFlt,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">snps = c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"CpG"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905">"SBE"</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># remove cross-reactive probes</styled-content>

                        <styled-content style="font-size:15px;">keep &lt;- !(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">featureNames</styled-content>
                        <styled-content style="font-size:15px;">(age.mSetSqFlt) %in% xReactiveProbes$TargetID)
age.mSetSqFlt &lt;- age.mSetSqFlt[keep,]</styled-content>
                    </preformat>
                </p>
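                <p>The probe-filtering logic used above can be seen more easily on a toy example. The sketch below uses a hypothetical matrix of detection p-values and a hypothetical cross-reactive probe list; the probe and sample names are invented for illustration.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;"># toy detection p-values: 4 hypothetical probes (rows) by 3 samples (columns)
detP &lt;- matrix(c(0.001, 0.002,  0.0001,
                 0.001, 0.2,    0.003,
                 0.5,   0.6,    0.4,
                 0.009, 0.0099, 0.005),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("cg", 1:4), paste0("S", 1:3)))

# keep only probes that are detected (p &lt; 0.01) in every sample
keep &lt;- rowSums(detP &lt; 0.01) == ncol(detP)
flt &lt;- detP[keep, , drop = FALSE]

# drop probes that appear on a hypothetical cross-reactive list
xReactive &lt;- c("cg4")
flt &lt;- flt[!(rownames(flt) %in% xReactive), , drop = FALSE]
rownames(flt)</styled-content>
                    </preformat>
                </p>
                <p>A probe is retained only if it passes the detection threshold in all samples, so a single failed sample (as for the second probe) is enough to remove it.</p>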
                <p>As this dataset contains samples from both males and females, we can use it to demonstrate the effect of removing sex chromosome probes on the data. The MDS plots below show the relationship between the samples in the ageing dataset before and after sex chromosome probe removal (
                    <xref ref-type="fig" rid="f13">Figure 13</xref>). It is apparent that before the removal of sex chromosome probes, the samples cluster by sex in the second principal component. When the sex chromosome probes are removed, age is the largest source of variation present and the male and female samples no longer form separate clusters.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903"># tag sex chromosome probes for removal</styled-content>

                        <styled-content style="font-size:15px;">keep &lt;- !(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">featureNames</styled-content>
                        <styled-content style="font-size:15px;">(age.mSetSqFlt) %in% ann450k$Name[ann450k$chr %in%</styled-content>
								   
                        <styled-content style="font-size:15px;color:#214A87"> c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"chrX"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"chrY"</styled-content>
                        <styled-content style="font-size:15px;">)])
age.pal &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">brewer.pal</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">8</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"Set1"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">2</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">getM</styled-content>
                        <styled-content style="font-size:15px;">(age.mSetSqFlt),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
	 
                        <styled-content style="font-size:15px;color:#214A87">col=</styled-content>
                        <styled-content style="font-size:15px;">age.pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">factor</styled-content>
                        <styled-content style="font-size:15px;">(age.targets$Sample_Group)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">labels=</styled-content>
                        <styled-content style="font-size:15px;">age.targets$Sex,</styled-content>
	 
                        <styled-content style="font-size:15px;color:#214A87">main=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"With Sex CHR Probes"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"topleft"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">factor</styled-content>
                        <styled-content style="font-size:15px;">(age.targets$Sample_Group)),</styled-content>
	
                        <styled-content style="font-size:15px;color:#214A87">text.col=</styled-content>
                        <styled-content style="font-size:15px;">age.pal)</styled-content>


                        <styled-content style="font-size:15px;color:#214A87">plotMDS</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">getM</styled-content>
                        <styled-content style="font-size:15px;">(age.mSetSqFlt[keep,]),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">top=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">1000</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">gene.selection=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"common"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
	 
                        <styled-content style="font-size:15px;color:#214A87">col=</styled-content>
                        <styled-content style="font-size:15px;">age.pal[</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">factor</styled-content>
                        <styled-content style="font-size:15px;">(age.targets$Sample_Group)],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">labels=</styled-content>
                        <styled-content style="font-size:15px;">age.targets$Sex,</styled-content>
	 
                        <styled-content style="font-size:15px;color:#214A87">main=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"Without Sex CHR Probes"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905">"top"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">legend=levels</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">factor</styled-content>
                        <styled-content style="font-size:15px;">(age.targets$Sample_Group)),</styled-content>
	
                        <styled-content style="font-size:15px;color:#214A87">text.col=</styled-content>
                        <styled-content style="font-size:15px;">age.pal)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903"># remove sex chromosome probes from data</styled-content>

                        <styled-content style="font-size:15px;">age.mSetSqFlt &lt;- age.mSetSqFlt[keep,]</styled-content>
                    </preformat>
                </p>
                <fig fig-type="figure" id="f13" orientation="portrait" position="float">
                    <label>Figure 13. </label>
                    <caption>
                        <title>When samples from both males and females are included in a study, sex is usually the largest source of variation in methylation data.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure13.gif"/>
                </fig>
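                <p>The effect of removing sex chromosome probes can also be reproduced on simulated data. The sketch below is an illustration only: it uses base R 
                    <monospace>cmdscale</monospace> on Euclidean distances rather than 
                    <monospace>plotMDS</monospace>, and the effect sizes, probe counts and group indicators are all hypothetical choices made so that age drives the first dimension and sex the second.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">set.seed(13)
n.samples &lt;- 12
age &lt;- rep(c(0, 1), each = 6)   # hypothetical age group indicator
sex &lt;- rep(c(0, 1), times = 6)  # hypothetical sex indicator, balanced within age

# simulated M-values: 200 probes by 12 samples
M &lt;- matrix(rnorm(200 * n.samples), nrow = 200)
M[1:100, age == 1] &lt;- M[1:100, age == 1] + 3      # probes affected by age
M[181:200, sex == 1] &lt;- M[181:200, sex == 1] + 4  # simulated sex chromosome probes
is.sex.probe &lt;- seq_len(nrow(M)) &gt; 180

# classical MDS with and without the simulated sex chromosome probes
mds.with &lt;- cmdscale(dist(t(M)), k = 2)
mds.without &lt;- cmdscale(dist(t(M[!is.sex.probe, ])), k = 2)

# absolute correlation of each MDS dimension (rows) with age and sex (columns)
assoc &lt;- function(mds) round(abs(cor(mds, cbind(age, sex))), 2)
assoc(mds.with)
assoc(mds.without)</styled-content>
                    </preformat>
                </p>
                <p>In this simulation the second dimension tracks sex only while the simulated sex chromosome probes are present; once they are removed, the association with sex weakens and age remains the dominant source of variation.</p>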
                <p>We can test for differentially variable CpGs using the 
                    <monospace>varFit</monospace> function in the 
                    <italic toggle="yes">missMethyl</italic> package. The syntax for specifying which groups we are interested in testing is slightly different to the standard way a model is specified in 
                    <monospace>limma</monospace>, particularly for designs where an intercept is fitted (see the 
                    <italic toggle="yes">missMethyl</italic> 
                    <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/packages/release/bioc/vignettes/missMethyl/inst/doc/missMethyl.pdf">vignette</ext-link> for further details). For the ageing data, the design matrix includes an intercept term, and a term for age. The 
                    <monospace>coef</monospace> argument in the 
                    <monospace>varFit</monospace> function indicates which columns of the design matrix correspond to the intercept and grouping factor. Thus, for the ageing dataset we set 
                    <monospace>coef=c(1,2)</monospace>. Note that design matrices without intercept terms are permitted, with specific contrasts tested using the 
                    <monospace>contrasts.varFit</monospace> function.</p>
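                <p>To make the roles of the two design matrix columns concrete, the sketch below builds a design matrix for hypothetical group labels and then compares variability for a single simulated CpG using absolute deviations from the group means. This Levene-type comparison is only a simplified stand-in for the idea behind differential variability testing; it is not the 
                    <monospace>varFit</monospace> implementation, and the labels, group sizes and standard deviations are invented.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">set.seed(29)

# hypothetical group labels; the real ones are in age.targets$Sample_Group
group &lt;- factor(rep(c("centenarian", "newborn"), each = 30))

# design with an intercept (column 1) and a group term (column 2)
design &lt;- model.matrix(~group)
colnames(design)  # "(Intercept)" "groupnewborn"

# one simulated CpG: equal means, larger variance in centenarians
y &lt;- c(rnorm(30, mean = 0, sd = 2), rnorm(30, mean = 0, sd = 0.5))

# Levene-type comparison: absolute deviations from each group's mean
z &lt;- abs(y - ave(y, group))
levene &lt;- t.test(z ~ group)
levene$p.value</styled-content>
                    </preformat>
                </p>
                <p>The two column names printed here correspond to the columns selected by 
                    <monospace>coef=c(1,2)</monospace> in the intercept parameterisation described above.</p>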
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903"># get M-values for analysis</styled-content>

                        <styled-content style="font-size:15px;">age.mVals &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">getM</styled-content>
                        <styled-content style="font-size:15px;">(age.mSetSqFlt)

design &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">model.matrix</styled-content>
                        <styled-content style="font-size:15px;">(~</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">factor</styled-content>
                        <styled-content style="font-size:15px;">(age.targets$Sample_Group))</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903"># Fit the model for differential variability
# specifying the intercept and age as the grouping factor</styled-content>

                        <styled-content style="font-size:15px;">fitvar &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">varFit</styled-content>
                        <styled-content style="font-size:15px;">(age.mVals,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">design =</styled-content> 
                        <styled-content style="font-size:15px;">design,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">coef = c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">2</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>


                        <styled-content style="font-size:15px;color:#8F5903"># Summary of differential variability</styled-content>

                        <styled-content style="font-size:15px;color:#214A87">summary</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">decideTests</styled-content>
                        <styled-content style="font-size:15px;">(fitvar))</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">##    (Intercept) factor(age.targets$Sample_Group)OLD
## -1 		0 				 1325
## 0 	    11441 			       393451
## 1 	   417787 				34452</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">topDV &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">topVar</styled-content>
                        <styled-content style="font-size:15px;">(fitvar,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">coef=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">2</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903"># Top 10 differentially variable CpGs between old vs. newborns</styled-content>

                        <styled-content style="font-size:15px;">topDV</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">## 	      SampleVar LogVarRatio DiffLevene         t 	    P.Value
## cg19078576 1.1128910    3.746586  0.8539180  7.006476 0.0000000006234780
## cg11661000 0.5926226    3.881306  0.8413614  6.945711 0.0000000008176807
## cg07065220 1.0111380    4.181802  0.9204407  6.840327 0.0000000013069867
## cg05995465 1.4478673   -5.524284 -1.3035981 -6.708321 0.0000000023462074
## cg18091046 1.1121511    3.564282  1.0983340  6.679920 0.0000000026599570
## cg05491001 0.9276904    3.869760  0.7118591  6.675892 0.0000000027077013
## cg05542681 1.0287320    3.783637  0.9352814  6.635588 0.0000000032347355
## cg02726803 0.3175570    4.063650  0.6418968  6.607508 0.0000000036608219
## cg08362283 1.0028907    4.783899  0.6970960  6.564472 0.0000000044240941
## cg18160402 0.5624192    3.716228  0.5907985  6.520508 0.0000000053665347
## 	       Adj.P.Value
## cg19078576 0.0001754857
## cg11661000 0.0001754857
## cg07065220 0.0001869984
## cg05995465 0.0001937035
## cg18091046 0.0001937035
## cg05491001 0.0001937035
## cg05542681 0.0001964159
## cg02726803 0.0001964159
## cg08362283 0.0002109939
## cg18160402 0.0002303467</styled-content>
                    </preformat>
                </p>
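                <p>The <monospace>DiffLevene</monospace> column can be easier to interpret with a toy example. The sketch below assumes a Levene-type statistic based on absolute deviations from group means, applied to simulated M-values for a single hypothetical CpG. It omits the empirical Bayes moderation and should be read as a rough illustration of the idea, not as the <italic toggle="yes">missMethyl</italic> implementation; all object names are assumptions made for this example.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```r
set.seed(1)

# Simulated M-values for one hypothetical CpG: same mean, different spread
group = factor(rep(c("NewBorns", "OLD"), each = 20))
m = c(rnorm(20, mean = 0, sd = 0.2), rnorm(20, mean = 0, sd = 1))

# Absolute deviation of each sample from its own group mean
abs_dev = abs(m - ave(m, group))

# Difference in mean absolute deviation between groups (OLD minus NewBorns)
group_means = tapply(abs_dev, group, mean)
diff_levene = unname(group_means["OLD"] - group_means["NewBorns"])

# Ordinary two-sample t-test on the absolute deviations
res = t.test(abs_dev ~ group)
print(diff_levene)
print(res$p.value)
```
                    </preformat>
                </p>
                <p>A positive difference indicates greater variability in the second group, which mirrors the sign convention seen in the table above for most of the top-ranked CpGs.</p>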
                <p>As with the differential methylation analysis, it is useful to plot sample-wise beta values for the differentially variable CpGs to ensure the significant results are not driven by artifacts or outliers (
                    <xref ref-type="fig" rid="f14">Figure 14</xref>).</p>
                <fig fig-type="figure" id="f14" orientation="portrait" position="float">
                    <label>Figure 14. </label>
                    <caption>
                        <title>As for DMPs, it is useful to plot the top few differentially variable CpGs to check that the results make sense.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure14.gif"/>
                </fig>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903"># get beta values for ageing data</styled-content>

                        <styled-content style="font-size:15px;">age.bVals &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">getBeta</styled-content>
                        <styled-content style="font-size:15px;">(age.mSetSqFlt)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#214A87">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">2</styled-content>,
                        <styled-content style="font-size:15px;color:#0000CF">2</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;color:#214A87">sapply</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87">rownames</styled-content>
                        <styled-content style="font-size:15px;">(topDV)[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF">1</styled-content>:
                        <styled-content style="font-size:15px;color:#0000CF">4</styled-content>
                        <styled-content style="font-size:15px;">], function(cpg){</styled-content>
  
                        <styled-content style="font-size:15px;color:#214A87">plotCpg</styled-content>
                        <styled-content style="font-size:15px;">(age.bVals,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">cpg=</styled-content>
                        <styled-content style="font-size:15px;">cpg,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87">pheno=</styled-content>
                        <styled-content style="font-size:15px;">age.targets$Sample_Group,</styled-content>
	   
                        <styled-content style="font-size:15px;color:#214A87">ylab =</styled-content> 
                        <styled-content style="font-size:15px;color:#4F9905">"Beta values"</styled-content>
                        <styled-content style="font-size:15px;">)
})</styled-content>
                    </preformat>
                </p>
                <p>An example of testing for differential variability when the design matrix does not have an intercept term is detailed in the 
                    <italic toggle="yes">missMethyl</italic> 
                    <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/packages/release/bioc/vignettes/missMethyl/inst/doc/missMethyl.pdf">vignette</ext-link>.</p>
            </sec>
            <sec>
                <title>Cell type composition</title>
                <p>As methylation is cell type-specific and methylation arrays provide CpG methylation values for a population of cells, biological findings from samples composed of a mixture of cell types, such as blood, can be confounded with cell type composition (
                    <xref ref-type="bibr" rid="ref-18">Jaffe &amp; Irizarry, 2014</xref>). The 
                    <italic toggle="yes">minfi</italic> function 
                    <monospace>estimateCellCounts</monospace> facilitates the estimation of the level of confounding between phenotype and cell type composition in a set of samples. The function uses a modified version of the method published by 
                    <xref ref-type="bibr" rid="ref-16">Houseman 
                        <italic toggle="yes">et al.</italic> (2012)</xref> and the package 
                    <monospace>FlowSorted.Blood.450k</monospace>, which contains 450k methylation data from sorted blood cells, to estimate the cell type composition of blood samples.</p>
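                <p>The general idea behind reference-based composition estimation can be sketched with simulated data. The example below is a simplified illustration under stated assumptions: the reference profiles, the cell type names and the mixed sample are all hypothetical, and the proportions are estimated by least squares with non-negativity bounds using base R. It does not reproduce the probe selection or normalisation performed by <monospace>estimateCellCounts</monospace>.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```r
set.seed(42)

# Hypothetical reference profiles: beta values for 100 CpGs in 3 cell types
n_cpg = 100
ref = matrix(runif(n_cpg * 3), nrow = n_cpg,
             dimnames = list(NULL, c("TypeA", "TypeB", "TypeC")))

# Simulated mixed sample with known proportions plus a little noise
true_prop = c(TypeA = 0.6, TypeB = 0.3, TypeC = 0.1)
mixed = as.vector(ref %*% true_prop) + rnorm(n_cpg, sd = 0.01)

# Least squares fit with the proportions bounded below by zero
loss = function(w) sum((mixed - ref %*% w)^2)
fit = optim(par = rep(1 / 3, 3), fn = loss, method = "L-BFGS-B", lower = 0)
est_prop = setNames(fit$par, colnames(ref))
print(round(est_prop, 2))
```
                    </preformat>
                </p>
                <p>In this sketch the estimates are not forced to sum to one; with informative reference profiles and little noise they should nevertheless land close to the simulated proportions.</p>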
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># load sorted blood cell data package</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">library</styled-content>
                        <styled-content style="font-size:15px;">(FlowSorted.Blood.450k)</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># ensure that the "Slide" column of the rgSet pheno data is numeric
# to avoid "estimateCellCounts" error</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">pData</styled-content>
                        <styled-content style="font-size:15px;">(age.rgSet)$Slide &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">as.numeric</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">pData</styled-content>
                        <styled-content style="font-size:15px;">(age.rgSet)$Slide)</styled-content>

                        <styled-content style="font-size:15px;color:#8F5903;"># estimate cell counts</styled-content>

                        <styled-content style="font-size:15px;">cellCounts &lt;-</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">estimateCellCounts</styled-content>
                        <styled-content style="font-size:15px;">(age.rgSet)</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;">
                            <italic toggle="yes">## [estimateCellCounts] Combining user data with reference (flow sorted) data.</italic>

## [estimateCellCounts] Processing user and reference data together.

## [preprocessQuantile] Mapping to genome.

## [preprocessQuantile] Fixing outliers.

## [preprocessQuantile] Quantile normalizing.

## [estimateCellCounts] Picking probes for composition estimation.

## [estimateCellCounts] Estimating composition.</styled-content>
                    </preformat>
                </p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                        <styled-content style="font-size:15px;color:#8F5903;"># plot cell type proportions by age</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">par</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#214A87;">mfrow=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">))</styled-content>

                        <styled-content style="font-size:15px;">a = cellCounts[age.targets$Sample_Group == </styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"NewBorns"</styled-content>
                        <styled-content style="font-size:15px;">,]</styled-content>

                        <styled-content style="font-size:15px;">b = cellCounts[age.targets$Sample_Group == </styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"OLD"</styled-content>
                        <styled-content style="font-size:15px;">,]</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">boxplot</styled-content>
                        <styled-content style="font-size:15px;">(a,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">at=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">5</styled-content>
                        <styled-content style="font-size:15px;">*</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content> 
                        <styled-content style="font-size:15px;">+</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">xlim=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">18</styled-content>
                        <styled-content style="font-size:15px;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">ylim=range</styled-content>
                        <styled-content style="font-size:15px;">(a, b),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">xaxt=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"n"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
         
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">age.pal[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">],</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">main=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">""</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">ylab=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"Cell type proportion"</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">boxplot</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;">b</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">at=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">5</styled-content>
                        <styled-content style="font-size:15px;">*</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content> 
                        <styled-content style="font-size:15px;">+</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">xaxt=</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"n"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">add=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">col=</styled-content>
                        <styled-content style="font-size:15px;">age.pal</styled-content>
                        <styled-content style="font-size:15px;">[</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">2</styled-content>
                        <styled-content style="font-size:15px;">])</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">axis</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">1</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">at=</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">0</styled-content>
                        <styled-content style="font-size:15px;">:</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">5</styled-content>
                        <styled-content style="font-size:15px;">*</styled-content>
                        <styled-content style="font-size:15px;color:#0000CF;">3</styled-content> 
                        <styled-content style="font-size:15px;">+</styled-content> 
                        <styled-content style="font-size:15px;color:#0000CF;">1.5</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">labels=colnames</styled-content>
                        <styled-content style="font-size:15px;">(a),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">tick=</styled-content>
                        <styled-content style="font-size:15px;color:#8F5903;">TRUE</styled-content>
                        <styled-content style="font-size:15px;">)</styled-content>

                        <styled-content style="font-size:15px;color:#214A87;">legend</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"topleft"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">legend=c</styled-content>
                        <styled-content style="font-size:15px;">(</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"NewBorns"</styled-content>
                        <styled-content style="font-size:15px;">,</styled-content>
                        <styled-content style="font-size:15px;color:#4F9905;">"OLD"</styled-content>
                        <styled-content style="font-size:15px;">),</styled-content> 
                        <styled-content style="font-size:15px;color:#214A87;">fill=</styled-content>
                        <styled-content style="font-size:15px;">age.pal)</styled-content>
                    </preformat>
                </p>
                <p>As reported by 
                    <xref ref-type="bibr" rid="ref-18">Jaffe &amp; Irizarry (2014)</xref>, the preceding plot demonstrates that differences in blood cell type proportions are strongly confounded with age in this dataset (
                    <xref ref-type="fig" rid="f15">Figure 15</xref>). Performing cell composition estimation can alert you to potential issues with confounding when analysing a mixed cell type dataset. Based on the results, some type of adjustment for cell type composition may be considered, although a naive cell type adjustment is not recommended. 
                    <xref ref-type="bibr" rid="ref-18">Jaffe &amp; Irizarry (2014)</xref> outline several strategies for dealing with cell type composition issues.</p>
                <fig fig-type="figure" id="f15" orientation="portrait" position="float">
                    <label>Figure 15. </label>
                    <caption>
                        <title>If samples come from a population of mixed cells, e.g. blood, it is advisable to check for potential confounding between differences in cell type proportions and the factor of interest.</title>
                    </caption>
                    <graphic orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/12222/f973945b-f2ea-4590-853a-525abd9237c8_figure15.gif"/>
                </fig>
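                <p>A simple numerical companion to the boxplot is to summarise and test each cell type against the factor of interest. The sketch below uses a small matrix of hypothetical proportions (<monospace>toy_counts</monospace>) and hypothetical group labels; in the workflow, the corresponding objects would be <monospace>cellCounts</monospace> and <monospace>age.targets$Sample_Group</monospace>. It is meant only as a check for potential confounding, not as a method of adjustment.</p>
                <p>
                    <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
```r
set.seed(7)

# Hypothetical cell type proportions for 10 samples (rows) and 2 cell types
group = factor(rep(c("NewBorns", "OLD"), each = 5))
toy_counts = cbind(
  CD4T = c(runif(5, 0.05, 0.15), runif(5, 0.20, 0.30)),
  Gran = c(runif(5, 0.60, 0.70), runif(5, 0.40, 0.50))
)

# Mean proportion of each cell type within each group
group_means = apply(toy_counts, 2, function(x) tapply(x, group, mean))
print(round(group_means, 2))

# Wilcoxon rank-sum test of each cell type against the grouping factor
p_vals = apply(toy_counts, 2, function(x) wilcox.test(x ~ group)$p.value)
print(p_vals)
```
                    </preformat>
                </p>
                <p>Small P-values for several cell types would suggest that group differences in methylation may partly reflect differences in composition.</p>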
            </sec>
        </sec>
        <sec sec-type="discussion">
            <title>Discussion</title>
            <p>Here we present a commonly used workflow for methylation array analysis based on a series of Bioconductor packages. While we have not included all the possible functions or analysis options that are available for detecting differential methylation, we have demonstrated a workflow that we regularly use in our own analyses. Specifically, we have not demonstrated more complex types of analyses such as removing unwanted variation in a differential methylation study (
                <xref ref-type="bibr" rid="ref-21">Leek 
                    <italic toggle="yes">et al.</italic>, 2012</xref>; 
                <xref ref-type="bibr" rid="ref-23">Maksimovic 
                    <italic toggle="yes">et al.</italic>, 2015</xref>; 
                <xref ref-type="bibr" rid="ref-37">Teschendorff 
                    <italic toggle="yes">et al.</italic>, 2011</xref>), block finding (
                <xref ref-type="bibr" rid="ref-1">Aryee 
                    <italic toggle="yes">et al.</italic>, 2014</xref>; 
                <xref ref-type="bibr" rid="ref-13">Hansen 
                    <italic toggle="yes">et al.</italic>, 2011</xref>) or A/B compartment prediction (
                <xref ref-type="bibr" rid="ref-11">Fortin &amp; Hansen, 2015</xref>). The differential methylation workflow presented here demonstrates how to read in data and perform quality control, filtering, normalisation and differential methylation testing. In addition, we demonstrate analyses for differential variability, gene set testing and estimation of cell type composition. Visualisation is an important aspect of exploring the results of an analysis, and we also provide an example of generating region-level views of the data.</p>
        </sec>
        <sec>
            <title>Software versions</title>
            <p>The R markdown file and/or R script for this version of the workflow can be downloaded from: 
                <ext-link ext-link-type="uri" xlink:href="https://github.com/Oshlack/MethylationAnalysisWorkflow/tree/version3/scripts">https://github.com/Oshlack/MethylationAnalysisWorkflow/tree/version3/scripts</ext-link>. A 
                <italic toggle="yes">live</italic> version of this workflow that is regularly built using updated Bioconductor packages can be accessed on the Bioconductor website: 

                <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/help/workflows/methylationArrayAnalysis/">https://www.bioconductor.org/help/workflows/methylationArrayAnalysis/</ext-link>. The version of the workflow presented here uses the following packages available from Bioconductor (release 3.4):</p>
            <p>
                <preformat orientation="portrait" position="float" preformat-type="computer code" xml:space="preserve">
                    <styled-content style="font-size:15px;color:#214A87;">sessionInfo</styled-content>
                    <styled-content style="font-size:15px;">()</styled-content>


                    <styled-content style="font-size:15px;">## R version 3.3.1 (2016-06-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: CentOS release 6.7 (Final)
##
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
##  [1] splines   grid      stats4    parallel  stats     graphics  grDevices
##  [8] utils     datasets  methods   base
##
## other attached packages:
##  [1] stringr_1.2.0
##  [2] DMRcate_1.10.8
##  [3] DMRcatedata_1.10.1
##  [4] DSS_2.14.0
##  [5] bsseq_1.10.0
##  [6] Gviz_1.18.2
##  [7] minfiData_0.20.0
##  [8] matrixStats_0.51.0
##  [9] missMethyl_1.8.0
## [10] RColorBrewer_1.1-2
## [11] IlluminaHumanMethylation450kmanifest_0.4.0
## [12] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.0
## [13] minfi_1.20.2
## [14] bumphunter_1.14.0
## [15] locfit_1.5-9.1
## [16] iterators_1.0.8
## [17] foreach_1.4.3
## [18] Biostrings_2.42.1
## [19] XVector_0.14.1
## [20] SummarizedExperiment_1.4.0
## [21] GenomicRanges_1.26.4
## [22] GenomeInfoDb_1.10.3
## [23] IRanges_2.8.2
## [24] S4Vectors_0.12.2
## [25] Biobase_2.34.0
## [26] BiocGenerics_0.20.0
## [27] limma_3.30.13
##
## loaded via a namespace (and not attached):
##   [1] colorspace_1.3-2
##   [2] siggenes_1.48.0
##   [3] mclust_5.2.3
##   [4] rprojroot_1.2
##   [5] biovizBase_1.22.0
##   [6] htmlTable_1.9
##   [7] base64enc_0.1-3
##   [8] dichromat_2.0-0
##   [9] base64_2.0
##  [10] interactiveDisplayBase_1.12.0
##  [11] AnnotationDbi_1.36.2
##  [12] IlluminaHumanMethylationEPICanno.ilm10b2.hg19_0.6.0
##  [13] codetools_0.2-15
##  [14] R.methodsS3_1.7.1
##  [15] methylumi_2.20.0
##  [16] knitr_1.15.1
##  [17] Formula_1.2-1
##  [18] Rsamtools_1.26.1
##  [19] annotate_1.52.1
##  [20] cluster_2.0.5
##  [21] GO.db_3.4.0
##  [22] R.oo_1.21.0
##  [23] shiny_1.0.0
##  [24] httr_1.2.1
##  [25] backports_1.0.5
##  [26] assertthat_0.1
##  [27] Matrix_1.2-8
##  [28] lazyeval_0.2.0
##  [29] acepack_1.4.1
##  [30] htmltools_0.3.5
##  [31] tools_3.3.1
##  [32] gtable_0.2.0
##  [33] doRNG_1.6
##  [34] Rcpp_0.12.9
##  [35] multtest_2.30.0
##  [36] preprocessCore_1.36.0
##  [37] nlme_3.1-131
##  [38] rtracklayer_1.34.2
##  [39] mime_0.5
##  [40] ensembldb_1.6.2
##  [41] rngtools_1.2.4
##  [42] gtools_3.5.0
##  [43] statmod_1.4.29
##  [44] XML_3.98-1.5
##  [45] beanplot_1.2
##  [46] org.Hs.eg.db_3.4.0
##  [47] AnnotationHub_2.6.5
##  [48] zlibbioc_1.20.0
##  [49] MASS_7.3-45
##  [50] scales_0.4.1
##  [51] BSgenome_1.42.0
##  [52] VariantAnnotation_1.20.3
##  [53] BiocInstaller_1.24.0
##  [54] GEOquery_2.40.0
##  [55] yaml_2.1.14
##  [56] memoise_1.0.0
##  [57] gridExtra_2.2.1
##  [58] ggplot2_2.2.1
##  [59] pkgmaker_0.22
##  [60] biomaRt_2.30.0
##  [61] rpart_4.1-10
##  [62] reshape_0.8.6
##  [63] latticeExtra_0.6-28
##  [64] stringi_1.1.2
##  [65] RSQLite_1.1-2
##  [66] highr_0.6
##  [67] genefilter_1.56.0
##  [68] permute_0.9-4
##  [69] checkmate_1.8.2
##  [70] GenomicFeatures_1.26.3
##  [71] BiocParallel_1.8.1
##  [72] bitops_1.0-6
##  [73] nor1mix_1.2-2
##  [74] evaluate_0.10
##  [75] lattice_0.20-34
##  [76] ruv_0.9.6
##  [77] GenomicAlignments_1.10.1
##  [78] htmlwidgets_0.8
##  [79] plyr_1.8.4
##  [80] magrittr_1.5
##  [81] R6_2.2.0
##  [82] Hmisc_4.0-2
##  [83] DBI_0.6
##  [84] foreign_0.8-67
##  [85] survival_2.40-1
##  [86] RCurl_1.95-4.8
##  [87] nnet_7.3-12
##  [88] tibble_1.2
##  [89] rmarkdown_1.3
##  [90] data.table_1.10.4
##  [91] digest_0.6.12
##  [92] xtable_1.8-2
##  [93] httpuv_1.3.3
##  [94] illuminaio_0.16.0
##  [95] R.utils_2.5.0
##  [96] openssl_0.9.6
##  [97] munsell_0.4.3
##  [98] registry_0.3
##  [99] BiasedUrn_1.07
## [100] quadprog_1.5-5</styled-content>
				</preformat>
			</p>
        </sec>
    </body>
    <back>
        <ref-list>
            <ref id="ref-1">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Aryee</surname>
                            <given-names>MJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jaffe</surname>
                            <given-names>AE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Corrada-Bravo</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2014</year>;<volume>30</volume>(<issue>10</issue>):<fpage>1363</fpage>&#x2013;<lpage>9</lpage>.
                    <pub-id pub-id-type="pmid">24478339</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btu049</pub-id>
                    <pub-id pub-id-type="pmcid">4016708</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-2">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Aryee</surname>
                            <given-names>MJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wu</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ladd-Acosta</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Accurate genome-scale percentage DNA methylation estimates from microarray data.</article-title>
                    <source>

                        <italic toggle="yes">Biostatistics.</italic>
</source>
                    <year>2011</year>;<volume>12</volume>(<issue>2</issue>):<fpage>197</fpage>&#x2013;<lpage>210</lpage>.
                    <pub-id pub-id-type="pmid">20858772</pub-id>
                    <pub-id pub-id-type="doi">10.1093/biostatistics/kxq055</pub-id>
                    <pub-id pub-id-type="pmcid">3062148</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-3">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Benjamini</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hochberg</surname>
                            <given-names>Y</given-names>
                        </name>
</person-group>:
                    <article-title>Controlling the false discovery rate: a practical and powerful approach to multiple testing.</article-title>
                    <source>

                        <italic toggle="yes">J R Stat Soc B.</italic>
</source>
                    <year>1995</year>;<volume>57</volume>(<issue>1</issue>):<fpage>289</fpage>&#x2013;<lpage>300</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.stat.purdue.edu/~doerge/BIOINFORM.D/FALL06/Benjamini%20and%20Y%20FDR.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-4">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bibikova</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Barnes</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tsan</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>High density DNA methylation array with single CpG site resolution.</article-title>
                    <source>

                        <italic toggle="yes">Genomics.</italic>
</source>
                    <year>2011</year>;<volume>98</volume>(<issue>4</issue>):<fpage>288</fpage>&#x2013;<lpage>95</lpage>.
                    <pub-id pub-id-type="pmid">21839163</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.ygeno.2011.07.007</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-5">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bibikova</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Le</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Barnes</surname>
                            <given-names>B</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Genome-wide DNA methylation profiling using Infinium
                        <sup>&#x00ae;</sup> assay.</article-title>
                    <source>

                        <italic toggle="yes">Epigenomics.</italic>
</source>
                    <year>2009</year>;<volume>1</volume>(<issue>1</issue>):<fpage>177</fpage>&#x2013;<lpage>200</lpage>.
                    <pub-id pub-id-type="pmid">22122642</pub-id>
                    <pub-id pub-id-type="doi">10.2217/epi.09.14</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-6">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bird</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>DNA methylation patterns and epigenetic memory.</article-title>
                    <source>

                        <italic toggle="yes">Genes Dev.</italic>
</source>
                    <year>2002</year>;<volume>16</volume>(<issue>1</issue>):<fpage>6</fpage>&#x2013;<lpage>21</lpage>.
                    <pub-id pub-id-type="pmid">11782440</pub-id>
                    <pub-id pub-id-type="doi">10.1101/gad.947102</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-7">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>YA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lemire</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Choufani</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Discovery of cross-reactive probes and polymorphic CpGs in the Illumina Infinium HumanMethylation450 microarray.</article-title>
                    <source>

                        <italic toggle="yes">Epigenetics.</italic>
</source>
                    <year>2013</year>;<volume>8</volume>(<issue>2</issue>):<fpage>203</fpage>&#x2013;<lpage>9</lpage>.
                    <pub-id pub-id-type="pmid">23314698</pub-id>
                    <pub-id pub-id-type="doi">10.4161/epi.23470</pub-id>
                    <pub-id pub-id-type="pmcid">3592906</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-8">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Cruickshank</surname>
                            <given-names>MN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Oshlack</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Theda</surname>
                            <given-names>C</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Analysis of epigenetic changes in survivors of preterm birth reveals the effect of gestational age and evidence for a long term legacy.</article-title>
                    <source>

                        <italic toggle="yes">Genome Med.</italic>
</source>
                    <year>2013</year>;<volume>5</volume>(<issue>10</issue>):<fpage>96</fpage>.
                    <pub-id pub-id-type="pmid">24134860</pub-id>
                    <pub-id pub-id-type="doi">10.1186/gm500</pub-id>
                    <pub-id pub-id-type="pmcid">3978871</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-9">
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Davis</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Du</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bilke</surname>
                            <given-names>S</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Methylumi: Handle Illumina Methylation Data</article-title>.<year>2015</year>.
                    <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/packages/3.3/bioc/manuals/methylumi/man/methylumi.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-10">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Du</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Huang</surname>
                            <given-names>CC</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis.</article-title>
                    <source>

                        <italic toggle="yes">BMC Bioinformatics.</italic>
</source>
                    <year>2010</year>;<volume>11</volume>(<issue>1</issue>):<fpage>587</fpage>.
                    <pub-id pub-id-type="pmid">21118553</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2105-11-587</pub-id>
                    <pub-id pub-id-type="pmcid">3012676</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-11">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Fortin</surname>
                            <given-names>JP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hansen</surname>
                            <given-names>KD</given-names>
                        </name>
</person-group>:
                    <article-title>Reconstructing A/B compartments as revealed by Hi-C using long-range correlations in epigenetic data.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2015</year>;<volume>16</volume>(<issue>1</issue>):<fpage>180</fpage>.
                    <pub-id pub-id-type="pmid">26316348</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-015-0741-y</pub-id>
                    <pub-id pub-id-type="pmcid">4574526</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-12">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Fortin</surname>
                            <given-names>JP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Labbe</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lemire</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Functional normalization of 450k methylation array data improves replication in large cancer studies.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2014</year>;<volume>15</volume>(<issue>12</issue>):<fpage>503</fpage>.
                    <pub-id pub-id-type="pmid">25599564</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-014-0503-2</pub-id>
                    <pub-id pub-id-type="pmcid">4283580</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-13">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hansen</surname>
                            <given-names>KD</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Timp</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bravo</surname>
                            <given-names>HC</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Increased methylation variation in epigenetic domains across cancer types.</article-title>
                    <source>

                        <italic toggle="yes">Nat Genet.</italic>
</source>
                    <year>2011</year>;<volume>43</volume>(<issue>8</issue>):<fpage>768</fpage>&#x2013;<lpage>75</lpage>.
                    <pub-id pub-id-type="pmid">21706001</pub-id>
                    <pub-id pub-id-type="doi">10.1038/ng.865</pub-id>
                    <pub-id pub-id-type="pmcid">3145050</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-14">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Heyn</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ferreira</surname>
                            <given-names>HJ</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Distinct DNA methylomes of newborns and centenarians.</article-title>
                    <source>

                        <italic toggle="yes">Proc Natl Acad Sci U S A.</italic>
</source>
                    <year>2012</year>;<volume>109</volume>(<issue>26</issue>):<fpage>10522</fpage>&#x2013;<lpage>7</lpage>.
                    <pub-id pub-id-type="pmid">22689993</pub-id>
                    <pub-id pub-id-type="doi">10.1073/pnas.1120658109</pub-id>
                    <pub-id pub-id-type="pmcid">3387108</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-15">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hicks</surname>
                            <given-names>SC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Irizarry</surname>
                            <given-names>RA</given-names>
                        </name>
</person-group>:
                    <article-title>
                        <italic toggle="yes">quantro</italic>: a data-driven approach to guide the choice of an appropriate normalization method.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2015</year>;<volume>16</volume>(<issue>1</issue>):<fpage>117</fpage>.
                    <pub-id pub-id-type="pmid">26040460</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-015-0679-0</pub-id>
                    <pub-id pub-id-type="pmcid">4495646</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-16">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Houseman</surname>
                            <given-names>EA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Accomando</surname>
                            <given-names>WP</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Koestler</surname>
                            <given-names>DC</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>DNA methylation arrays as surrogate measures of cell mixture distribution.</article-title>
                    <source>

                        <italic toggle="yes">BMC Bioinformatics.</italic>
</source>
                    <year>2012</year>;<volume>13</volume>(<issue>1</issue>):<fpage>86</fpage>.
                    <pub-id pub-id-type="pmid">22568884</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2105-13-86</pub-id>
                    <pub-id pub-id-type="pmcid">3532182</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-17">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Huber</surname>
                            <given-names>W</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Carey</surname>
                            <given-names>VJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gentleman</surname>
                            <given-names>R</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Orchestrating high-throughput genomic analysis with Bioconductor.</article-title>
                    <source>

                        <italic toggle="yes">Nat Methods.</italic>
</source>
                    <year>2015</year>;<volume>12</volume>(<issue>2</issue>):<fpage>115</fpage>&#x2013;<lpage>21</lpage>.
                    <pub-id pub-id-type="pmid">25633503</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nmeth.3252</pub-id>
                    <pub-id pub-id-type="pmcid">4509590</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-18">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jaffe</surname>
                            <given-names>AE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Irizarry</surname>
                            <given-names>RA</given-names>
                        </name>
</person-group>:
                    <article-title>Accounting for cellular heterogeneity is critical in epigenome-wide association studies.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2014</year>;<volume>15</volume>(<issue>2</issue>):<fpage>R31</fpage>.
                    <pub-id pub-id-type="pmid">24495553</pub-id>
                    <pub-id pub-id-type="doi">10.1186/gb-2014-15-2-r31</pub-id>
                    <pub-id pub-id-type="pmcid">4053810</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-19">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jaffe</surname>
                            <given-names>AE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Murakami</surname>
                            <given-names>P</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lee</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies.</article-title>
                    <source>

                        <italic toggle="yes">Int J Epidemiol.</italic>
</source>
                    <year>2012</year>;<volume>41</volume>(<issue>1</issue>):<fpage>200</fpage>&#x2013;<lpage>9</lpage>.
                    <pub-id pub-id-type="pmid">22422453</pub-id>
                    <pub-id pub-id-type="doi">10.1093/ije/dyr238</pub-id>
                    <pub-id pub-id-type="pmcid">3304533</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-20">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Laird</surname>
                            <given-names>PW</given-names>
                        </name>
</person-group>:
                    <article-title>The power and the promise of DNA methylation markers.</article-title>
                    <source>

                        <italic toggle="yes">Nat Rev Cancer.</italic>
</source>
                    <year>2003</year>;<volume>3</volume>(<issue>4</issue>):<fpage>253</fpage>&#x2013;<lpage>66</lpage>.
                    <pub-id pub-id-type="pmid">12671664</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nrc1045</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-21">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Leek</surname>
                            <given-names>JT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Johnson</surname>
                            <given-names>WE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Parker</surname>
                            <given-names>HS</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The sva package for removing batch effects and other unwanted variation in high-throughput experiments.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2012</year>;<volume>28</volume>(<issue>6</issue>):<fpage>882</fpage>&#x2013;<lpage>3</lpage>.
                    <pub-id pub-id-type="pmid">22257669</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/bts034</pub-id>
                    <pub-id pub-id-type="pmcid">3307112</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-22">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lonnstedt</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Speed</surname>
                            <given-names>T</given-names>
                        </name>
</person-group>:
                    <article-title>Replicated Microarray Data.</article-title>
                    <source>

                        <italic toggle="yes">Statistica Sinica.</italic>
</source>
                    <year>2002</year>;<volume>12</volume>(<issue>1</issue>):<fpage>31</fpage>&#x2013;<lpage>46</lpage>.
                    <ext-link ext-link-type="uri" xlink:href="https://pdfs.semanticscholar.org/056e/7774cd48d90dad92fbac784baa1837459aec.pdf">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-23">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Maksimovic</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gagnon-Bartsch</surname>
                            <given-names>JA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Speed</surname>
                            <given-names>TP</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Removing unwanted variation in a differential methylation analysis of Illumina HumanMethylation450 array data.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2015</year>;<volume>43</volume>(<issue>16</issue>):<fpage>e106</fpage>.
                    <pub-id pub-id-type="pmid">25990733</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkv526</pub-id>
                    <pub-id pub-id-type="pmcid">4652745</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-24">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Maksimovic</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Gordon</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Oshlack</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>SWAN: Subset-quantile within array normalization for Illumina Infinium HumanMethylation450 BeadChips.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2012</year>;<volume>13</volume>(<issue>6</issue>):<fpage>R44</fpage>.
                    <pub-id pub-id-type="pmid">22703947</pub-id>
                    <pub-id pub-id-type="doi">10.1186/gb-2012-13-6-r44</pub-id>
                    <pub-id pub-id-type="pmcid">3446316</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-25">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Mancuso</surname>
                            <given-names>FM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Montfort</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Carreras</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>HumMeth27QCReport: an R package for quality control and primary analysis of Illumina Infinium methylation data.</article-title>
                    <source>

                        <italic toggle="yes">BMC Res Notes.</italic>
</source>
                    <year>2011</year>;<volume>4</volume>:<fpage>546</fpage>.
                    <pub-id pub-id-type="pmid">22182516</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1756-0500-4-546</pub-id>
                    <pub-id pub-id-type="pmcid">3285701</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-26">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Morris</surname>
                            <given-names>TJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Butcher</surname>
                            <given-names>LM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Feber</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>ChAMP: 450k Chip Analysis Methylation Pipeline.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2014</year>;<volume>30</volume>(<issue>3</issue>):<fpage>428</fpage>&#x2013;<lpage>30</lpage>.
                    <pub-id pub-id-type="pmid">24336642</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btt684</pub-id>
                    <pub-id pub-id-type="pmcid">3904520</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-27">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Peters</surname>
                            <given-names>TJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Buckley</surname>
                            <given-names>MJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Statham</surname>
                            <given-names>AL</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>
                        <italic toggle="yes">De novo</italic> identification of differentially methylated regions in the human genome.</article-title>
                    <source>

                        <italic toggle="yes">Epigenetics Chromatin.</italic>
</source>
                    <year>2015</year>;<volume>8</volume>(<issue>1</issue>):<fpage>6</fpage>.
                    <pub-id pub-id-type="pmid">25972926</pub-id>
                    <pub-id pub-id-type="pmcid">4429355</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-28">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Phipson</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Maksimovic</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Oshlack</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>missMethyl: an R package for analyzing data from Illumina&#x2019;s HumanMethylation450 platform.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2016</year>;<volume>32</volume>(<issue>2</issue>):<fpage>286</fpage>&#x2013;<lpage>88</lpage>.
                    <pub-id pub-id-type="pmid">26424855</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btv560</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-29">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Phipson</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Oshlack</surname>
                            <given-names>A</given-names>
                        </name>
</person-group>:
                    <article-title>DiffVar: a new method for detecting differential variability with application to methylation in cancer and aging.</article-title>
                    <source>

                        <italic toggle="yes">Genome Biol.</italic>
</source>
                    <year>2014</year>;<volume>15</volume>(<issue>9</issue>):<fpage>465</fpage>.
                    <pub-id pub-id-type="pmid">25245051</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s13059-014-0465-4</pub-id>
                    <pub-id pub-id-type="pmcid">4210618</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-30">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Pidsley</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wong</surname>
                            <given-names>CCY</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Volta</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A data-driven approach to preprocessing Illumina 450K methylation array data.</article-title>
                    <source>

                        <italic toggle="yes">BMC Genomics.</italic>
</source>
                    <year>2013</year>;<volume>14</volume>(<issue>1</issue>):<fpage>293</fpage>.
                    <pub-id pub-id-type="pmid">23631413</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1471-2164-14-293</pub-id>
                    <pub-id pub-id-type="pmcid">3769145</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-31">
                <mixed-citation publication-type="journal">
                    <collab>R Core Team</collab>:
                    <article-title>R: A language and environment for statistical computing.</article-title> Vienna, Austria: R Foundation for Statistical Computing. <year>2014</year>.
                    <ext-link ext-link-type="uri" xlink:href="http://www.r-project.org/">Reference Source</ext-link>
                </mixed-citation>
            </ref>
            <ref id="ref-32">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ritchie</surname>
                            <given-names>ME</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Phipson</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wu</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>
                        <italic toggle="yes">limma</italic> powers differential expression analyses for RNA-sequencing and microarray studies.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2015</year>;<volume>43</volume>(<issue>7</issue>):<fpage>e47</fpage>.
                    <pub-id pub-id-type="pmid">25605792</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkv007</pub-id>
                    <pub-id pub-id-type="pmcid">4402510</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-33">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Smith</surname>
                            <given-names>ML</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Baggerly</surname>
                            <given-names>KA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bengtsson</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>
                        <italic toggle="yes">illuminaio</italic>: An open source IDAT parsing tool for Illumina microarrays [version 1; referees: 2 approved].</article-title>
                    <source>

                        <italic toggle="yes">F1000Res.</italic>
</source>
                    <year>2013</year>;<volume>2</volume>:<fpage>264</fpage>.
                    <pub-id pub-id-type="pmid">24701342</pub-id>
                    <pub-id pub-id-type="doi">10.12688/f1000research.2-264.v1</pub-id>
                    <pub-id pub-id-type="pmcid">3968891</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-34">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Smyth</surname>
                            <given-names>GK</given-names>
                        </name>
</person-group>:
                    <article-title>Linear models and empirical Bayes methods for assessing differential expression in microarray experiments.</article-title>
                    <source>

                        <italic toggle="yes">Stat Appl Genet Mol Biol.</italic>
</source>
                    <year>2004</year>;<volume>3</volume>(<issue>1</issue>): Article 3.
                    <pub-id pub-id-type="pmid">16646809</pub-id>
                    <pub-id pub-id-type="doi">10.2202/1544-6115.1027</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-35">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Sun</surname>
                            <given-names>Z</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Chai</surname>
                            <given-names>HS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wu</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Batch effect correction for genome-wide methylation data with Illumina Infinium platform.</article-title>
                    <source>

                        <italic toggle="yes">BMC Med Genomics.</italic>
</source>
                    <year>2011</year>;<volume>4</volume>:<fpage>84</fpage>.
                    <pub-id pub-id-type="pmid">22171553</pub-id>
                    <pub-id pub-id-type="doi">10.1186/1755-8794-4-84</pub-id>
                    <pub-id pub-id-type="pmcid">3265417</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-36">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Teschendorff</surname>
                            <given-names>AE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Marabita</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lechner</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A beta-mixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2013</year>;<volume>29</volume>(<issue>2</issue>):<fpage>189</fpage>&#x2013;<lpage>96</lpage>.
                    <pub-id pub-id-type="pmid">23175756</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/bts680</pub-id>
                    <pub-id pub-id-type="pmcid">3546795</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-37">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Teschendorff</surname>
                            <given-names>AE</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhuang</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Widschwendter</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <article-title>Independent surrogate variable analysis to deconvolve confounding factors in large-scale microarray profiling studies.</article-title>
                    <source>

                        <italic toggle="yes">Bioinformatics.</italic>
</source>
                    <year>2011</year>;<volume>27</volume>(<issue>11</issue>):<fpage>1496</fpage>&#x2013;<lpage>1505</lpage>.
                    <pub-id pub-id-type="pmid">21471010</pub-id>
                    <pub-id pub-id-type="doi">10.1093/bioinformatics/btr171</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-38">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Touleimat</surname>
                            <given-names>N</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tost</surname>
                            <given-names>J</given-names>
                        </name>
</person-group>:
                    <article-title>Complete pipeline for Infinium
                        <sup>&#x00ae;</sup> Human Methylation 450K BeadChip data processing using subset quantile normalization for accurate DNA methylation estimation.</article-title>
                    <source>

                        <italic toggle="yes">Epigenomics.</italic>
</source>
                    <year>2012</year>;<volume>4</volume>(<issue>3</issue>):<fpage>325</fpage>&#x2013;<lpage>41</lpage>.
                    <pub-id pub-id-type="pmid">22690668</pub-id>
                    <pub-id pub-id-type="doi">10.2217/epi.12.21</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-39">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Triche</surname>
                            <given-names>TJ</given-names>
                            <suffix>Jr</suffix>
                        </name>

                        <name name-style="western">
                            <surname>Weisenberger</surname>
                            <given-names>DJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Van Den Berg</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Low-level processing of Illumina Infinium DNA Methylation BeadArrays.</article-title>
                    <source>

                        <italic toggle="yes">Nucleic Acids Res.</italic>
</source>
                    <year>2013</year>;<volume>41</volume>(<issue>7</issue>):<fpage>e90</fpage>.
                    <pub-id pub-id-type="pmid">23476028</pub-id>
                    <pub-id pub-id-type="doi">10.1093/nar/gkt090</pub-id>
                    <pub-id pub-id-type="pmcid">3627582</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-40">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>D</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Huang</surname>
                            <given-names>Y</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Comparison of different normalization assumptions for analyses of DNA methylation data from the cancer genome.</article-title>
                    <source>

                        <italic toggle="yes">Gene.</italic>
</source>
                    <year>2012</year>;<volume>506</volume>(<issue>1</issue>):<fpage>36</fpage>&#x2013;<lpage>42</lpage>.
                    <pub-id pub-id-type="pmid">22771920</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.gene.2012.06.075</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-41">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wu</surname>
                            <given-names>H</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Caffo</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jaffee</surname>
                            <given-names>HA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Redefining CpG islands using hidden Markov models.</article-title>
                    <source>

                        <italic toggle="yes">Biostatistics.</italic>
</source>
                    <year>2010</year>;<volume>11</volume>(<issue>3</issue>):<fpage>499</fpage>&#x2013;<lpage>514</lpage>.
                    <pub-id pub-id-type="pmid">20212320</pub-id>
                    <pub-id pub-id-type="doi">10.1093/biostatistics/kxq005</pub-id>
                    <pub-id pub-id-type="pmcid">2883304</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-42">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wu</surname>
                            <given-names>MC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Joubert</surname>
                            <given-names>BR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Kuan</surname>
                            <given-names>PF</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A systematic assessment of normalization approaches for the Infinium 450K methylation platform.</article-title>
                    <source>

                        <italic toggle="yes">Epigenetics.</italic>
</source>
                    <year>2014</year>;<volume>9</volume>(<issue>2</issue>):<fpage>318</fpage>&#x2013;<lpage>29</lpage>.
                    <pub-id pub-id-type="pmid">24241353</pub-id>
                    <pub-id pub-id-type="doi">10.4161/epi.27119</pub-id>
                    <pub-id pub-id-type="pmcid">3962542</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref-43">
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Maksimovic</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Naselli</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Genome-wide DNA methylation analysis identifies hypomethylated genes regulated by FOXP3 in human regulatory T cells.</article-title>
                    <source>

                        <italic toggle="yes">Blood.</italic>
</source>
                    <year>2013</year>;<volume>122</volume>(<issue>16</issue>):<fpage>2823</fpage>&#x2013;<lpage>36</lpage>.
                    <pub-id pub-id-type="pmid">23974203</pub-id>
                    <pub-id pub-id-type="doi">10.1182/blood-2013-02-481788</pub-id>
                    <pub-id pub-id-type="pmcid">3798997</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report17311">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.9951.r17311</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Peters</surname>
                        <given-names>Timothy J</given-names>
                    </name>
                    <xref ref-type="aff" rid="r17311a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r17311a1">
                    <label>1</label>Garvan Institute of Medical Research, Darlinghurst, NSW, Australia</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>I am the primary author of the DMRcate package</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>31</day>
                <month>10</month>
                <year>2016</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2016 Peters TJ</copyright-statement>
                <copyright-year>2016</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport17311" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.8839.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>I am happy with the amendments made, and approve of the publication.</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report15416">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.9951.r15416</article-id>
            <title-group>
                <article-title>Reviewer response for version 2</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Hickey</surname>
                        <given-names>Peter F.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r15416a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-8153-6258</uri>
                </contrib>
                <aff id="r15416a1">
                    <label>1</label>Johns Hopkins University, Baltimore, MD, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>3</day>
                <month>8</month>
                <year>2016</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2016 Hickey P</copyright-statement>
                <copyright-year>2016</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport15416" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.8839.2"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>Thank you for making the data easily available and re-running the workflow with the latest release version of Bioconductor, Jovanna. My minor comments on the initial submission have also been addressed to my satisfaction. Therefore, I happily approve the revised manuscript. I believe it will be a helpful resource to the Bioconductor community and those wishing to learn best practices for analysing DNA methylation array data.</p>
            <p> Minor comments I have on the revised manuscript: 
                <list list-type="bullet">
                    <list-item>
                        <p>I got slightly different numbers and output when running the workflow (see below). I believe this is due to a bug being fixed in v1.18.4 (see 
                            <ext-link ext-link-type="uri" xlink:href="https://github.com/Bioconductor-mirror/minfi/commit/c16b28e2f4c7127696cbea807f668d0a10c54b37">https://github.com/Bioconductor-mirror/minfi/commit/c16b28e2f4c7127696cbea807f668d0a10c54b37</ext-link>); I tested using minfi v1.18.6. These differences were first observable when running This affects all subsequent output such as which probes are identified as DMPs. Unfortunately, such changes in output are par for the course in bioinformatics software. These differences are outside the author's control, and I do not believe these affect the quality and usefulness of the workflow. I'm not sure anything can be done except to point out that such differences may exist due to different software versions.</p>
                    </list-item>
                </list> &gt; head(mVals[,1:5])</p>
            <p> &#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0; &#x00a0; &#x00a0; &#x00a0; &#x00a0; &#x00a0;&#x00a0;&#x00a0; naive.1 &#x00a0; &#x00a0; rTreg.2&#x00a0; act_naive.3&#x00a0;&#x00a0;&#x00a0; naive.4&#x00a0; act_naive.5</p>
            <p> cg13869341&#x00a0; 2.205808&#x00a0; 2.205808&#x00a0;&#x00a0;&#x00a0; 2.167697&#x00a0; 2.173122&#x00a0;&#x00a0;&#x00a0; 2.106660</p>
            <p> cg24669183&#x00a0; 2.169414&#x00a0; 2.235964&#x00a0;&#x00a0;&#x00a0; 2.280734&#x00a0; 1.632309&#x00a0;&#x00a0;&#x00a0; 2.184435</p>
            <p> cg15560884&#x00a0; 1.761176&#x00a0; 1.577578&#x00a0;&#x00a0;&#x00a0; 1.597503&#x00a0; 1.777486&#x00a0;&#x00a0;&#x00a0; 1.764999</p>
            <p> cg01014490 -2.918305 -2.979692&#x00a0;&#x00a0; -4.042347 -3.448734&#x00a0;&#x00a0; -3.189924</p>
            <p> cg17505339&#x00a0; 3.082191&#x00a0; 3.924931&#x00a0;&#x00a0;&#x00a0; 4.163206&#x00a0; 3.255373&#x00a0;&#x00a0;&#x00a0; 3.654134</p>
            <p> cg11954957&#x00a0; 1.546401&#x00a0; 1.912204&#x00a0;&#x00a0;&#x00a0; 1.727910&#x00a0; 2.441267&#x00a0;&#x00a0;&#x00a0; 1.618331</p>
            <p>
                <list list-type="bullet">
                    <list-item>
                        <p>There is a typo (extra whitespace) in the BiocViews URL</p>
                    </list-item>
                    <list-item>
                        <p>I was unable to get the Gviz plot to correctly render. Unsure what's going on here as the Gviz version on my machine matches that in the workflow (v1.16.1)</p>
                    </list-item>
                </list> &gt; plotTracks(tracks, from=minbase, to=maxbase, showTitle=TRUE, add53=TRUE,</p>
            <p> +&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0;&#x00a0; add35=TRUE, grid=TRUE, lty.grid=3, sizes=sizes, length(tracks))</p>
            <p> Error in valid.viewport(x, y, width, height, just, gp, clip, xscale, yscale,&#x00a0; :</p>
            <p> &#x00a0; invalid 'yscale' in viewport</p>
            <p> In addition: Warning messages:</p>
            <p> 1: In min(x) : no non-missing arguments to min; returning Inf</p>
            <p> 2: In max(x) : no non-missing arguments to max; returning -Inf</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report14247">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.9514.r14247</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Risso</surname>
                        <given-names>Davide</given-names>
                    </name>
                    <xref ref-type="aff" rid="r14247a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-8508-5012</uri>
                </contrib>
                <aff id="r14247a1">
                    <label>1</label>Division of Biostatistics, School of Public Health, University of California, Berkeley, Berkeley, CA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>22</day>
                <month>6</month>
                <year>2016</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2016 Risso D</copyright-statement>
                <copyright-year>2016</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport14247" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.8839.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>As someone who has experience with R/Bioconductor and with genomics data, but not direct experience analyzing methylation array data, I found the workflow very useful and I would suggest it to anyone wanting to start analyzing this type of data.</p>
            <p> </p>
            <p> I do agree with the other reviewers that the value of the workflow will be greatly increased if the dataset used was available as an R object. The authors should consider submitting an experiment data package to Bioconductor to accompany the workflow. Alternatively, they could provide the dataset as a supplementary file.</p>
            <p> </p>
            <p> As for the analysis itself, I only have one major question. Note that I do not have direct experience analyzing methylation array data, so this is a genuine question rather than a criticism.</p>
            <p> </p>
            <p> In gene expression analysis, we tend to perform filtering prior to normalization, while the authors here first normalize the data by quantile normalization and then filter out probes that are low quality and/or affected by SNPs. Wouldn't it be safer to perform filtering before normalization? I understand that, given the few probes affected, the order likely has very little effect in this dataset. But I naively imagine that if there are many problematic probes and, say, the quality of the samples is confounded with the biology, there could be issues in using low-quality probes for normalization.</p>
            <p> </p>
            <p> Other minor points: 
                <list list-type="bullet">
                    <list-item>
                        <p>I agree that the code should be re-run with the latest release of R and Bioconductor.</p>
                    </list-item>
                    <list-item>
                        <p>In the definition of \beta, \alpha should be defined, and its default value in getBeta() should be specified.</p>
                    </list-item>
                    <list-item>
                        <p>Spelling: most of the article uses British English spelling, but the word "normalization" is sometimes (but not always) spelled in American English.</p>
                    </list-item>
                    <list-item>
                        <p>A sentence describing the procedure implemented in preprocessQuantile() is needed for readers not familiar with normalization.</p>
                    </list-item>
                    <list-item>
                        <p>I agree that it would be useful to provide a brief description of what a contrasts matrix is, as this section could be confusing for people unfamiliar with statistical models.</p>
                    </list-item>
                    <list-item>
                        <p>For the same reason, the authors should add a brief explanation of the problem of multiple testing and of what the false discovery rate is, or at least provide references to the appropriate literature.</p>
                    </list-item>
                </list>
            </p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment2079-14247">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Maksimovic</surname>
                            <given-names>Jovana</given-names>
                        </name>
                        <aff>Murdoch Childrens Research Institute, Australia</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>14</day>
                    <month>7</month>
                    <year>2016</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Thanks for your review, Davide.&#x00a0;</p>
                <p> </p>
                <p> While we agree that normalising after filtering makes sense, some practical aspects of the data objects that minfi uses make this difficult. Many (but not all) normalisation procedures in minfi accept an rgSet object, which can be thought of as a raw data object and cannot easily be subset by CpG site. These normalisation procedures then output a different type of data object, such as a MethylSet or GenomicRatioSet, which is much easier to work with when filtering out problematic CpG sites. Due to the sheer number of CpG sites observed per sample (&gt;450,000), we believe the order shouldn&#x2019;t make much difference for most datasets, especially if very poor quality samples are excluded prior to normalisation, although it is possible that there are exceptions to this.</p>
                <p> </p>
                <p> Response to minor points: 
                    <list list-type="bullet">
                        <list-item>
                            <p>We have spent some time modifying the workflow to run with the latest R and Bioconductor.</p>
                        </list-item>
                        <list-item>
                            <p>We have added additional details regarding beta values, M-values and the offset in the paper.&#x00a0;</p>
                        </list-item>
                        <list-item>
                            <p>We have changed "normalization" to "normalisation" throughout the text.</p>
                        </list-item>
                        <list-item>
                            <p>A sentence has been added about preprocessQuantile in the normalisation section.&#x00a0;</p>
                        </list-item>
                        <list-item>
                            <p>We have included some additional explanation of contrast&#x00a0;matrices.&#x00a0;</p>
                        </list-item>
                        <list-item>
                            <p>An additional paragraph has been added explaining the issues of multiple testing in very high dimensional data.</p>
                        </list-item>
                    </list>
                </p>
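                <p>The multiple-testing point in the final item above can be illustrated with a minimal base R sketch. The six p-values are hypothetical numbers chosen for illustration, not output from the workflow; p.adjust() is the standard function in R's stats package, and "BH" is its Benjamini-Hochberg false discovery rate adjustment.</p>
                <preformat><![CDATA[
# Hypothetical p-values for six probes (illustrative numbers, not workflow output)
p <- c(0.0001, 0.004, 0.019, 0.03, 0.2, 0.6)

# Bonferroni controls the family-wise error rate:
# each p-value is multiplied by the number of tests and capped at 1
bonf <- p.adjust(p, method = "bonferroni")

# Benjamini-Hochberg controls the false discovery rate
bh <- p.adjust(p, method = "BH")

# Count how many probes pass a 5% cut-off under each correction
n_bonf <- sum(bonf < 0.05)
n_bh <- sum(bh < 0.05)

print(data.frame(p = p, bonferroni = bonf, BH = bh))
]]></preformat>
                <p>With these illustrative values, two tests remain below the 5% cut-off after the Bonferroni adjustment and four after the Benjamini-Hochberg adjustment, which shows why false discovery rate control is usually preferred when hundreds of thousands of CpG sites are tested at once.</p>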
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report14248">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.9514.r14248</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Love</surname>
                        <given-names>Michael I.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r14248a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-8401-0545</uri>
                </contrib>
                <aff id="r14248a1">
                    <label>1</label>Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>21</day>
                <month>6</month>
                <year>2016</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2016 Love MI</copyright-statement>
                <copyright-year>2016</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport14248" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.8839.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>I am not an expert in analysis of methylation data, and have not used the methylation packages mentioned in this workflow, so I reviewed the workflow as an uninitiated reader might approach it.</p>
            <p> </p>
            <p> Major comments:</p>
            <p> </p>
            <p> I found the workflow to be easy to follow and informative. The authors have done a good job summarizing a large and complex topic into a reasonable size for a workflow article, while still mentioning the various alternatives that are possible at each step. I appreciated the focus on EDA and checking the quality of results by eye, for example the M-values for the most significant tests and the MDS plots colored by different variables.</p>
            <p> </p>
            <p> I did not try to run the code, and I agree with the other two reviewers that the code and datasets should be made available and linked to from this workflow.</p>
            <p> </p>
            <p> Minor comments:</p>
            <p> </p>
            <p> The first time &#x201c;moderated t-statistics&#x201d; is mentioned, it would be helpful to have a citation so that a reader who hasn&#x2019;t encountered this method before can read the reference, e.g. Smyth 2004.</p>
            <p> </p>
            <p> The first or second time IDAT files are mentioned, a small description of these would be useful, a little more than just that these are the raw files. Which platforms produce IDAT files? Are they compressed files? About how large are they?</p>
            <p> </p>
            <p> Figure 2: It wasn&#x2019;t obvious at first that the plot on the right is the same as the left but zoomed in.</p>
            <p> </p>
            <p> When discussing the choice of normalization depending on whether or not there are global changes across samples due to underlying biology, the authors might consider referencing the 
                <italic>quantro</italic>&#x00a0;article and Bioconductor package by Stephanie Hicks for determining whether there are global changes in genomic datasets across samples, and therefore whether quantile normalization is appropriate. Hicks has an example of whether or not to use quantile normalization for methylation data in the article. 
                <list list-type="bullet">
                    <list-item>
                        <p>
                            <ext-link ext-link-type="uri" xlink:href="https://www.bioconductor.org/packages/quantro">https://www.bioconductor.org/packages/quantro</ext-link>
                        </p>
                    </list-item>
                </list> Principal components is misspelled in the text: &#x201c;principle components&#x201d;</p>
            <p> </p>
            <p> In the paragraph above the call to 
                <italic>makeContrasts</italic>, it would be good to state in one sentence what this function does, for the benefit of someone who has never performed linear modeling before. Likewise, it would help to state explicitly that coef=1 references the first column of the contrast matrix. The B-statistic, which orders the topTable, should also be defined.</p>
            <p> </p>
            <p> The authors should explain a bit more what is being shown in Figure 10 in the caption.</p>
            <p> </p>
            <p> In the text and code the authors have written DNAseI, but I believe the more common capitalization is DNaseI.</p>
            <p> </p>
            <p> The authors might consider commenting on the top GO categories and the associated FDR values. How far down the list should one look? Can the authors advise the reader how GO results should be reported? Is it fair to pick out the most relevant categories from this list and only report them?</p>
            <p> </p>
            <p> It wasn&#x2019;t clear to me the difference between the 
                <italic>gometh</italic> and 
                <italic>gsameth</italic> approaches.</p>
            <p> </p>
            <p> It would be good to provide references to literature for &#x201c;it has been hypothesised that highly variable CpGs in cancer are important for tumour progression&#x201d;.</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <back>
            <ref-list>
                <title>References</title>
                <ref id="rep-ref-14248-1">
                    <label>1</label>
                    <mixed-citation publication-type="journal">
                        <person-group person-group-type="author"><name><surname>Smyth</surname><given-names>GK</given-names></name></person-group>:
                        <article-title>Linear models and empirical bayes methods for assessing differential expression in microarray experiments.</article-title>
                        <source>
                            <italic>Stat Appl Genet Mol Biol</italic>
                        </source>.<year>2004</year>;<volume>3</volume>:
                        <elocation-id>10.2202/1544-6115.1027</elocation-id>
                        <fpage>Article3</fpage>
                        <pub-id pub-id-type="pmid">16646809</pub-id>
                        <pub-id pub-id-type="doi">10.2202/1544-6115.1027</pub-id>
                    </mixed-citation>
                </ref>
                <ref id="rep-ref-14248-2">
                    <label>2</label>
                    <mixed-citation publication-type="journal">
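                        <person-group person-group-type="author"><name><surname>Hicks</surname><given-names>SC</given-names></name>, <name><surname>Irizarry</surname><given-names>RA</given-names></name></person-group>: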
                        <article-title>quantro: a data-driven approach to guide the choice of an appropriate normalization method.</article-title>
                        <source>
                            <italic>Genome Biol</italic>
                        </source>.<year>2015</year>;<volume>16</volume>:
                        <elocation-id>10.1186/s13059-015-0679-0</elocation-id>
                        <fpage>117</fpage>
                        <pub-id pub-id-type="pmid">26040460</pub-id>
                        <pub-id pub-id-type="doi">10.1186/s13059-015-0679-0</pub-id>
                    </mixed-citation>
                </ref>
            </ref-list>
        </back>
        <sub-article article-type="response" id="comment2080-14248">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Maksimovic</surname>
                            <given-names>Jovana</given-names>
                        </name>
                        <aff>Murdoch Childrens Research Institute, Australia</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>14</day>
                    <month>7</month>
                    <year>2016</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Thank you for reviewing our paper, Michael.</p>
                <p> In response to your comments/suggestions we have made the following changes: 
                    <list list-type="bullet">
                        <list-item>
                            <p>The Smyth 2004 citation has been added where &#x201c;moderated t-statistics&#x201d; is first mentioned.</p>
                        </list-item>
                        <list-item>
                            <p>A description of IDAT files has been added to the text along with a reference to a Bioconductor package that is specifically for reading IDAT files.</p>
                        </list-item>
                        <list-item>
                            <p>We have added to the legend for&#x00a0;Figure 2 to clarify that the plot on the right is the same as the left but zoomed in.</p>
                        </list-item>
                        <list-item>
                            <p>A reference to Hicks and quantro is now included in the Normalisation section.</p>
                        </list-item>
                        <list-item>
                            <p>Spelling mistakes and typos have been fixed.</p>
                        </list-item>
                        <list-item>
                            <p>Function of 
                                <italic>makeContrasts</italic> is described:&#x00a0;See response to Davide Risso.</p>
                        </list-item>
                        <list-item>
                            <p>We now explicitly state that coef=1 is referencing the first column of the contrast matrix.</p>
                        </list-item>
                        <list-item>
                            <p>Included explanation for B-statistic and citation.</p>
                        </list-item>
                        <list-item>
                            <p>More detail about the plot has been added to the figure caption for Figure 10.</p>
                        </list-item>
                        <list-item>
                            <p>Changed DNAseI to DNaseI.</p>
                        </list-item>
                        <list-item>
                            <p>Typically we would consider GO categories that have associated FDRs less than 5% as significant. Some discussion of these points has been added to the gene set testing section.</p>
                        </list-item>
                        <list-item>
                            <p>The gometh function specifically tests only GO and KEGG pathways, whereas gsameth is a more general function that requires the user to supply their own gene sets for testing.</p>
                        </list-item>
                        <list-item>
                            <p>We have changed the sentence &#x201c;it has been hypothesised that highly variable CpGs in cancer are important for tumour progression&#x201d;&#x00a0;to &#x201c;it has been hypothesised that highly variable CpGs in cancer may contribute to tumour heterogeneity&#x201d; and included the following reference:</p>
                            <p> </p>
                            <p> Hansen KD, Timp W, Bravo HC, Sabunciyan S, Langmead B, McDonald OG, Wen B, Wu H, Liu Y, Diep D, Briem E, Zhang K, Irizarry RA, Feinberg AP: Increased methylation variation in epigenetic domains across cancer types. Nat Genet. 2011, 43: 768-775</p>
                        </list-item>
                    </list>
                </p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report14425">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.9514.r14425</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Hickey</surname>
                        <given-names>Peter&#x00a0;F.</given-names>
                    </name>
                    <xref ref-type="aff" rid="r14425a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0002-8153-6258</uri>
                </contrib>
                <aff id="r14425a1">
                    <label>1</label>Johns Hopkins University, Baltimore, MD, USA</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>17</day>
                <month>6</month>
                <year>2016</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2016 Hickey P</copyright-statement>
                <copyright-year>2016</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport14425" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.8839.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This paper is a well-written workflow for analysing DNA methylation microarrays using Bioconductor packages. A challenge in writing these workflows is to produce something that is opinionated enough to be useful and balanced enough to be fair to packages developed by other people; I believe the authors have struck the right balance.</p>
            <p> However, my overall assessment is "Approved With Reservations" because the data used in the workflow is not easily available and therefore the workflow cannot be tested out by the interested reader.</p>
            <p> I spent some time trying to compile the raw data from GEO, but to me this feels a bit too much to expect of the reader, especially when it is likely that the interested reader is a beginner or intermediate user of bioinformatics software. I strongly believe the workflow should either include code to curate/construct/download the necessary files such as SampleSheet.csv and the IDAT files or include a link to prepared example data that can be used right from the 'Loading the data' section of the workflow. For example, 
                <ext-link ext-link-type="uri" xlink:href="http://f1000research.com/articles/4-1070/v1">http://f1000research.com/articles/4-1070/v1</ext-link> uses data from the 
                <italic>airway</italic> Bioconductor package that can easily be installed by the reader to follow along with the workflow.</p>
            <p> My other main suggestion would be to re-run the code using the recently published Bioconductor version 3.3. I expect this might require some minor changes to the code, e.g., the minfi::read.450k* functions have been deprecated in favour of minfi::read.metharray* functions.</p>
            <p> I have some additional minor comments and suggestions that I will include once I'm able to run through and review the workflow from beginning to end.</p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment2057-14425">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Hickey</surname>
                            <given-names>Peter</given-names>
                        </name>
                        <aff>Walter and Eliza Hall Institute of Medical Research, Australia</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>6</day>
                    <month>7</month>
                    <year>2016</year>
                </pub-date>
            </front-stub>
            <body>
                <p>
                    <list list-type="bullet">
                        <list-item>
                            <p>p2: beta = M / (M + U + alpha): the alpha parameter should be explained. Also, the definitions of both beta and M-value differ slightly from those given in the cited Du, P. et al. Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis. BMC Bioinformatics 11, 587 (2010).</p>
                        </list-item>
                        <list-item>
                            <p>p3: Perhaps worth mentioning that a complete list of packages for analysing DNA methylation data can be accessed using BiocViews (https://www.bioconductor.org/packages/release/bioc/html/biocViews.html and https://www.bioconductor.org/packages/release/BiocViews.html#___DNAMethylation)</p>
                        </list-item>
                        <list-item>
                            <p>p4: "...loading all the package libraries..." should be "...loading all the packages..."</p>
                        </list-item>
                        <list-item>
                            <p>p4: Perhaps worth commenting on which of the loaded packages are methylation-focused and/or purpose of other packages, e.g., stringr, Gviz.</p>
                        </list-item>
                        <list-item>
                            <p>p4: This is *super* pedantic (sorry!): strictly speaking the IlluminaHumanMethylation450kmanifest package provides the Illumina manifest for the 450k array, which can then be accessed by using `minfi::getAnnotation()`</p>
                        </list-item>
                        <list-item>
                            <p>Figure 2: Not immediately obvious that righthand plot is zoomed in version of lefthand plot. The caption could better explain this.</p>
                        </list-item>
                        <list-item>
                            <p>p10: The code produces a warning. Would be helpful to the reader to comment on whether this is cause for concern in this case.</p>
                        </list-item>
                        <list-item>
                            <p>Figure 9: Wondering whether helpful to have each panel with y-axis = [0, 1]</p>
                        </list-item>
                        <list-item>
                            <p>p25: `islandData` apparently contains 0 ranges. This looks like a bug in the code.</p>
                        </list-item>
                        <list-item>
                            <p>p28: "For gene ontology testing (default), the user can specific collection = "GO" for KEGG testing collection = "KEGG""; this sentence seems incomplete or is perhaps missing a word</p>
                        </list-item>
                        <list-item>
                            <p>p29 and p30: The code produces a warning. Would be helpful to the reader to comment on whether this is cause for concern in this case.</p>
                        </list-item>
                        <list-item>
                            <p>The workflow uses multiple packages and it's not always clear where each function comes from. This could be clarified e.g., by namespacing functions such as `limma::plotMDS()` instead of `plotMDS()`</p>
                        </list-item>
                    </list>
                </p>
            </body>
        </sub-article>
        <sub-article article-type="response" id="comment2081-14425">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Maksimovic</surname>
                            <given-names>Jovana</given-names>
                        </name>
                        <aff>Murdoch Childrens Research Institute, Australia</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>14</day>
                    <month>7</month>
                    <year>2016</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Thanks for taking the time to review our workflow, Peter.</p>
                <p> In response to your suggestion,&#x00a0;we have made the data available and rerun the workflow using the latest versions of R and Bioconductor.</p>
                <p> In response to your other comments:&#x00a0; 
                    <list list-type="bullet">
                        <list-item>
                            <p>p2: beta = M / (M + U + alpha), the alpha parameter should be explained. Also, both the definition of beta and Mvalue differ slightly from that given in the cited Du, P. et al. Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis. BMC Bioinformatics 11, 587 (2010).</p>
                            <p> 
                                <italic>This has been addressed. See response to Davide Risso.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>p3: Perhaps worth mentioning that a complete list of packages for analysing DNA methylation data can be accessed using BiocViews (https://www.bioconductor.org/packages/release/bioc/html/biocViews.html and https://www.bioconductor.org/packages/release/BiocViews.html#___DNAMethylation)</p>
                            <p> 
                                <italic>This has been added to the paper.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>p4: "...loading all the package libraries..." should be "...loading all the packages..."</p>
                            <p> 
                                <italic>The text has been modified accordingly.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>p4: Perhaps worth commenting on which of the loaded packages are methylation-focused and/or purpose of other packages, e.g., stringr, Gviz.</p>
                            <p> 
                                <italic>The text has been modified accordingly.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>p4: This is *super* pedantic (sorry!): strictly speaking the IlluminaHumanMethylation450kmanifest package provides the Illumina manifest for the 450k array, which can then be accessed by using `minfi::getAnnotation()`</p>
                            <p> 
                                <italic>The text has been modified accordingly.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>Figure 2: Not immediately obvious that righthand plot is zoomed in version of lefthand plot. The caption could better explain this.</p>
                            <p> 
                                <italic>This has been clarified in the figure caption.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>p10: The code produces a warning. Would be helpful to the reader to comment on whether this is cause for concern in this case.</p>
                            <p> 
                                <italic>A sentence has been included that explains the reason for the warning.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>Figure 9: Wondering whether helpful to have each panel with y-axis = [0, 1]</p>
                            <p> 
                                <italic>As we are trying to highlight the differences between the groups tested for 
                                    <bold>individual</bold>&#x00a0;CpGs and not comparing between CpGs, we feel that the axes are appropriate for the purposes of "sanity checking" the results of the statistical analysis.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>p25: `islandData` apparently contains 0 ranges. This looks like a bug in the code.</p>
                            <p> 
                                <italic>This was due to the fact that there were not any CpG islands present in the region being plotted; we have selected another region to plot that does have a CpG island so that islandData is no longer empty.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>p28: "For gene ontology testing (default), the user can specific collection = "GO" for KEGG testing collection = "KEGG""; this sentence seems incomplete or is perhaps missing a word</p>
                            <p> 
                                <italic>This sentence has been modified.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>p29 and p30: The code produces a warning. Would be helpful to the reader to comment on whether this is cause for concern in this case.</p>
                            <p> 
                                <italic>Added a sentence to the text to explain the warning.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>The workflow uses multiple packages and it's not always clear where each function comes from. This could be clarified e.g., by namespacing functions such as `limma::plotMDS()` instead of `plotMDS()`</p>
                            <p> 
                                <italic>We don&#x2019;t feel it is a particularly useful exercise to change every function to include the package name. Searching the help for any of the functions will inform users which package the function comes from. For example, ?plotMDS.</italic>
                            </p>
                        </list-item>
                    </list>
                </p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report14245">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.9514.r14245</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Peters</surname>
                        <given-names>Timothy J</given-names>
                    </name>
                    <xref ref-type="aff" rid="r14245a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r14245a1">
                    <label>1</label>Garvan Institute of Medical Research, Darlinghurst, NSW, Australia</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>I am the primary author of the DMRcate package</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>15</day>
                <month>6</month>
                <year>2016</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2016 Peters TJ</copyright-statement>
                <copyright-year>2016</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport14245" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.8839.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This paper describes a workflow for processing, filtering and analysis of Illumina Infinium methylation array data. It showcases a reproducible pipeline integrating a suite of tools from Bioconductor for multi-faceted genomic insights. While none of the tools individually are novel, their integration into a sensible, reproducible pipeline is. I am recommending this manuscript for indexation for 3 main reasons: 
                <list list-type="bullet">
                    <list-item>
                        <p>The tools contained therein and their application are in line with, or near, best practice. The workflow itself contains all the major steps that this reviewer usually uses for their methylation array processing.</p>
                    </list-item>
                    <list-item>
                        <p>An integrated workflow such as this will be valuable for novice and intermediate bioinformaticians who are tasked with processing methylation data. The number of caveats and sanity checks needed for appropriate biological interpretation is not trivial, and this workflow does a satisfactory job of outlining them.</p>
                    </list-item>
                    <list-item>
                        <p>The reproducible nature of this manuscript is a strength; it is very "coalface bioinformatics". Many published methods have very poor or buggy implementations and no effort is made to contextualise them in a given pipeline. Publication may set a precedent for other authors to give worked examples and context, which in this reviewer's opinion accelerates the path to best practice.</p>
                    </list-item>
                </list> </p>
            <p> Minor amendments needed: 
                <list list-type="order">
                    <list-item>
                        <p>I could not find any public links to the data files imported into this workflow. These ought to be provided.</p>
                    </list-item>
                    <list-item>
                        <p>a) The mathematical definition of&#x00a0;
                            <italic>&#x03b2;</italic> is given as&#x00a0;
                            <italic>&#x03b2; = M/(M + U + &#x03b1;)</italic>. While I realise 
                            <italic>&#x03b1;</italic>&#x00a0;is a fudge factor for offset purposes, this is not clear to the lay reader and needs to be made so.</p>
                        <p> </p>
                        <p> b) Why are there no offsets for&#x00a0;
                            <italic>M</italic> or&#x00a0;
                            <italic>U </italic>in the calculation of&#x00a0;
                            <italic>M&#x00a0;</italic>values, especially since there is one for the calculation of 
                            <italic>&#x03b2;</italic>?&#x00a0;On (admittedly) rare occasions&#x00a0;
                            <italic>M</italic> or&#x00a0;
                            <italic>U&#x00a0;</italic>will be exactly zero and hence offsets need to be put in both the numerator and denominator of the ratio to be log-transformed, else a non-number will result.</p>
                    </list-item>
                    <list-item>
                        <p>A justification for the preference of&#x00a0;
                            <italic>M</italic> values over&#x00a0;
                            <italic>&#x03b2;&#x00a0;</italic>values for use in the MDS plots is needed, especially since the statement is made that "Beta values are generally preferable ... for graphical presentation". This reviewer's experience is that&#x00a0;
                            <italic>&#x03b2;&#x00a0;</italic>is much more common for use in PCA/MDS, and is certainly the standard for other methylation platforms e.g. bisulfite sequencing data.</p>
                    </list-item>
                    <list-item>
                        <p>Legends are needed for density plots in Figs. 3 and 8. I appreciate&#x00a0;
                            <italic>minfi</italic>&#x00a0;annoyingly puts the default legend in the top right, obscuring the hypermethylated mode, but a custom call to legend() ought to fix this.</p>
                    </list-item>
                    <list-item>
                        <p>Appropriate Y-axis labels are needed for Figs. 9 and 14.</p>
                    </list-item>
                </list>
            </p>
            <p>Reviewer Expertise:</p>
            <p>NA</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment2082-14245">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Maksimovic</surname>
                            <given-names>Jovana</given-names>
                        </name>
                        <aff>Murdoch Childrens Research Institute, Australia</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>14</day>
                    <month>7</month>
                    <year>2016</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Thanks Tim for taking the time to review our paper.</p>
                <p> In response to your comments/suggestions: 
                    <list list-type="bullet">
                        <list-item>
                            <p>I could not find any public links to the data files imported into this workflow. These ought to be provided.</p>
                            <p> 
                                <italic>In addition to the references,&#x00a0;we have now included links to GEO for the data used. We have also made a bundle of all the data available on Figshare, which can be downloaded directly from within R to complete the workflow.&#x00a0;</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>a) The mathematical definition of&#x00a0;
                                <italic>&#x03b2;</italic>&#x00a0;is given as&#x00a0;
                                <italic>&#x03b2; = M/(M + U + &#x03b1;)</italic>. While I realise&#x00a0;
                                <italic>&#x03b1;</italic>&#x00a0;is a fudge factor for offset purposes, this is not clear to the lay reader and needs to be made so.</p>
                            <p> b) Why are there no offsets for&#x00a0;
                                <italic>M</italic>&#x00a0;or&#x00a0;
                                <italic>U&#x00a0;</italic>in the calculation of&#x00a0;
                                <italic>M&#x00a0;</italic>values, especially since there is one for the calculation of&#x00a0;
                                <italic>&#x03b2;</italic>?&#x00a0;On (admittedly) rare occasions&#x00a0;
                                <italic>M</italic>&#x00a0;or&#x00a0;
                                <italic>U&#x00a0;</italic>will be exactly zero and hence offsets need to be put in both the numerator and denominator of the ratio to be log-transformed, else a non-number will result.</p>
                            <p> 
                                <italic>This has been clarified in the text. See also response to Davide Risso.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>A justification for the preference of&#x00a0;
                                <italic>M</italic>&#x00a0;values over&#x00a0;
                                <italic>&#x03b2;&#x00a0;</italic>values for use in the MDS plots is needed, especially since the statement is made that "Beta values are generally preferable ... for graphical presentation". This reviewer's experience is that&#x00a0;
                                <italic>&#x03b2;&#x00a0;</italic>is much more common for use in PCA/MDS, and is certainly the standard for other methylation platforms e.g. bisulfite sequencing data.</p>
                            <p> 
                                <italic>We&#x00a0;disagree that beta values should be used in principal components analysis. While plotMDS does produce a graphic, the function is performing a statistical analysis (i.e. principal components analysis), which is based on normal distribution theory. The same reasons for not performing differential methylation analysis on the beta values apply in this case (i.e. heteroscedasticity of the beta values). &#x00a0;</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>Legends are needed for density plots in Figs. 3 and 8. I appreciate&#x00a0;
                                <italic>minfi</italic>&#x00a0;annoyingly puts the default legend in the top right, obscuring the hypermethylated mode, but a custom call to legend() ought to fix this.</p>
                            <p> 
                                <italic>These legends have been added as suggested.</italic>
                            </p>
                        </list-item>
                        <list-item>
                            <p>Appropriate Y-axis labels are needed for Figs. 9 and 14.</p>
                            <p> 
                                <italic>The Y-axis labels have been added.</italic>
                            </p>
                        </list-item>
                    </list>
                </p>
            </body>
        </sub-article>
    </sub-article>
</article>
