ddpcr: an R package and web application for analysis of droplet digital PCR data

Dean Attali; Roza Bidshahri; Charles Haynes; Jennifer Bryan

doi:10.12688/f1000research.9022.1

Software Tool Article

[version 1; peer review: 2 approved]

Dean Attali^1,2, Roza Bidshahri^2,3, Charles Haynes^2,3, Jennifer Bryan^2,4

PUBLISHED 17 Jun 2016

¹ Bioinformatics Training Program, University of British Columbia, Vancouver, Canada
² Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
³ Department of Chemical and Biological Engineering, University of British Columbia, Vancouver, Canada
⁴ Department of Statistics, University of British Columbia, Vancouver, Canada

This article is included in the RPackage gateway.

Abstract

Droplet digital polymerase chain reaction (ddPCR) is a novel platform for exact quantification of DNA which holds great promise in clinical diagnostics. It is increasingly popular due to its digital nature, which provides more accurate quantification and higher sensitivity than traditional real-time PCR. However, clinical adoption has been slowed in part by the lack of software tools available for analyzing ddPCR data. Here, we present ddpcr – a new R package for ddPCR visualization and analysis. In addition, ddpcr includes a web application (powered by the Shiny R package) that allows users to analyze ddPCR data using an interactive graphical interface.

Keywords

droplet digital PCR, shiny, bioinformatics, personalized medicine, rpackage, gating, Gaussian mixture models, kernel density estimates

Corresponding author: Jennifer Bryan

Competing interests: No competing interests were disclosed.

Grant information: CH receives support as a Canada Research Chair. This work was further supported by the Canadian Institutes of Health Research (CIHR) Bioinformatics Training Program scholarship.

Copyright: © 2016 Attali D et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Data associated with the article are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).

How to cite: Attali D, Bidshahri R, Haynes C and Bryan J. ddpcr: an R package and web application for analysis of droplet digital PCR data [version 1; peer review: 2 approved]. F1000Research 2016, 5:1411 (https://doi.org/10.12688/f1000research.9022.1) First published: 17 Jun 2016, 5:1411 (https://doi.org/10.12688/f1000research.9022.1) Latest published: 17 Jun 2016, 5:1411 (https://doi.org/10.12688/f1000research.9022.1)

Introduction

Droplet digital polymerase chain reaction (ddPCR) accurately quantifies targeted nucleic acid sequences (templates) by randomly partitioning sample DNA into isolated droplets, such that most droplets contain at most one template. The template within each droplet is then amplified and detected in a sequence-specific manner using a hydrolysis probe. The counting of droplets emitting a sequence-specific fluorescent signal permits the number of copies of that sequence present in the sample to be quantified with excellent sensitivity and precision. Different templates, such as wild-type and mutant alleles, may be quantified by using a uniquely labeled probe against each. The most commonly used reporter dyes on the probes are FAM (fluorescein) and HEX™, with the end-point fluorescence amplitudes for the two dyes measured by analyzing each droplet with a two-channel fluorescence detector¹.

ddPCR data readily lends itself to visualization as a two-dimensional scatter plot (Figure 1), in which the fluorescence amplitudes in both channels are plotted against each other for every droplet. In a ddPCR experiment designed to quantify two different templates, droplets ideally segregate into unique groups (clusters) that may include HEX-positive, FAM-positive, double-positive, and double-negative (empty) clusters². For example, distinct FAM-positive, double-positive, and empty droplet clusters can be seen in Figure 5B. In practice, some droplets record an ambiguous set of fluorescent signals that fall between the distinct positive and negative populations. Such droplets are termed “rain” and can be observed between all clusters. By gating the droplets into groups based on their fluorescence signals, the exact number of template-positive droplets can be counted to provide exact quantification in a digital form.
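The idea of gating can be illustrated with a minimal sketch that splits droplets into the four ideal clusters using one fixed cut-off per channel. This is only the simplest possible form of gating (comparable to drawing a manual cross-hair), not the automated method presented in this article. The function name `gate_droplets`, the cut-off values, and the amplitudes are all made up for illustration.

```r
# Hypothetical helper: label each droplet by comparing its two amplitudes
# to fixed per-channel cut-offs.
gate_droplets <- function(fam, hex, fam_cut, hex_cut) {
  fam_pos <- fam >= fam_cut
  hex_pos <- hex >= hex_cut
  label <- ifelse(fam_pos & hex_pos, "double-positive",
           ifelse(fam_pos, "FAM-positive",
           ifelse(hex_pos, "HEX-positive", "empty")))
  factor(label,
         levels = c("empty", "FAM-positive", "HEX-positive", "double-positive"))
}

# Made-up amplitudes for five droplets
fam <- c(900, 7200, 1100, 6900, 950)
hex <- c(800,  900, 5200, 5100, 850)

clusters <- gate_droplets(fam, hex, fam_cut = 4000, hex_cut = 3000)
table(clusters)
```

Fixed cut-offs of this kind ignore rain entirely, which is one reason a more careful gating strategy is needed for real data.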

Figure 1.

Raw ddPCR data from a two-channel ddPCR experiment (well F05 from the sample dataset).

Motivation

Quantification of template abundance from raw ddPCR data begins with assigning each droplet to a unique cluster or to rain. The QuantaSoft program (Bio-Rad, Hercules, CA) is designed to perform these assignments either via manual gating, with the usual disadvantages of subjectivity and non-reproducibility, or automatic gating. The algorithm used in the latter case is proprietary and can produce unsatisfactory results, especially when applied to ddPCR data obtained from formalin-fixed paraffin-embedded (FFPE) samples, as exemplified in Figure 5A.

Two third-party tools for automatic gating of ddPCR data have been described to date: ‘definetherain’ by Jones et al.³ and ‘ddpcRquant’ by Trypsteen et al.⁴. However, both are limited to single-channel ddPCR data and are therefore not applicable to increasingly common two-channel experiments such as that shown in Figure 1. Given the lack of tools for such analyses, users must currently resort to manual droplet gating.

Methods

Overview

To improve automated droplet assignments as well as permit visualization of ddPCR datasets, we have developed ddpcr, an R package that can be used to explore, visualize, and analyze two-channel ddPCR data. The R language⁵ was chosen because it is open-source and cross-platform, which allows anyone to use it freely on any operating system. R is also a popular language in the field of computational biology, and is the main data analysis language for many scientists. To improve access and ease of use, we also implemented an interactive web application using Shiny⁶, through which one can run the analysis using a simple point-and-click interface.

ddpcr has been thoroughly tested using R versions 3.2.3 and 3.3.0 on both Windows 7 and Ubuntu 14.04.2 machines. However, the package is likely to run on any machine with a working installation of R.

Plate object

The most important object in the ddpcr package is the ddpcr_plate object, referred to simply as the "plate object". A plate object represents all the data for experiments conducted on a 96-well PCR plate. It is created either by loading ddPCR input data files (see ‘Data import’) into a new plate object, or by loading an existing plate object that was previously saved to disk. A plate object contains all the information required to analyze the droplets within each well of a particular ddPCR plate, and it is both the input and the output of all the core analysis functions.

Workflow

To use the ddpcr package, it must first be installed and loaded.

install.packages("ddpcr")
library("ddpcr")

A very simple analysis workflow using a sample dataset can be performed using the following code, with the result of the code shown in Figure 2:

dir <- sample_data_dir()
my_data <- new_plate(dir, type = plate_types$fam_positive_pnpp)
my_data <- subset(my_data, "F05")
my_data <- analyze(my_data)
plot(my_data, show_drops_empty = TRUE, show_grid_labels = TRUE)

Figure 2.

ddPCR data from well F05 of the sample dataset analyzed using ddpcr.

While ddpcr contains dozens of functions, most analyses will follow a similar pattern: load ddPCR data into R using the new_plate() function, run the automated analysis using analyze(), and then explore the results using a variety of functions (Figure 3). The plot() function is used to visualize a dataset using ggplot2⁷, while the plate_meta() and plate_data() functions return the dataset’s metadata and droplet grouping data as R data frames, respectively. The save_plate() function can be called at any time to save the current state of the dataset to disk in a format that can be loaded back into ddpcr.
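As a hedged illustration of this pattern, the sketch below analyzes one well of the sample dataset, extracts the two data frames, and saves the plate to a temporary file. It uses the functions named above; `load_plate()` is not mentioned in the text and is assumed here to be the package's counterpart of `save_plate()`. The `requireNamespace()` guard is only there so that the snippet does nothing on a machine without ddpcr installed.

```r
# Run the example only when the ddpcr package is available.
has_ddpcr <- requireNamespace("ddpcr", quietly = TRUE)

if (has_ddpcr) {
  library(ddpcr)

  # Load the sample data, keep a single well, and run the automated analysis
  plate <- new_plate(sample_data_dir(), type = plate_types$fam_positive_pnpp)
  plate <- subset(plate, "F05")
  plate <- analyze(plate)

  # Explore the results as ordinary data frames
  meta     <- plate_meta(plate)   # per-well metadata
  droplets <- plate_data(plate)   # per-droplet data, including cluster assignment

  # Save the current state and read it back (load_plate() is an assumption)
  rds_file <- tempfile(fileext = ".rds")
  save_plate(plate, rds_file)
  restored <- load_plate(rds_file)
}
```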

The example code above uses a sample dataset, but in order to use new data, ddPCR data must be exported from QuantaSoft, as described in the next section. For more complex analysis or customizing the analysis parameters, see the full list of functions available by running ?ddpcr.

Figure 3.

Basic workflow for analyzing ddPCR data using the ddpcr package.

Data import

Before beginning analysis on a novel dataset, the first step is to import the ddPCR droplet fluorescence data into R. The raw data obtained from the fluorescence detector is encoded in a proprietary format that cannot be read by any software other than QuantaSoft, so the data must first be opened in QuantaSoft and exported into an accessible file format. QuantaSoft offers an option to export the droplet event data as a set of CSV (comma-separated values) files, as well as an option to export a metadata file that contains information on each well (Supplementary Figure 1 and Supplementary Figure 2). These CSV files are used as the input to ddpcr.
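In the package, the import itself is handled by new_plate(). To make the structure of the input concrete, the following sketch reads a folder of per-well CSV files with base R only. The file naming pattern (`<plate>_<well>_Amplitude.csv`), the column names (`Ch1 Amplitude`, `Ch2 Amplitude`, `Cluster`), and the mapping of channel 1 to FAM and channel 2 to HEX are assumptions about a typical QuantaSoft export, and the files are fabricated inside a temporary directory so that the example is self-contained.

```r
set.seed(42)

# Create a temporary folder with two fake per-well export files
export_dir <- tempfile("ddpcr_export_")
dir.create(export_dir)

write_fake_well <- function(well, n) {
  df <- data.frame(runif(n, 500, 8000), runif(n, 500, 8000), rep(1L, n))
  names(df) <- c("Ch1 Amplitude", "Ch2 Amplitude", "Cluster")  # assumed headers
  path <- file.path(export_dir, paste0("myplate_", well, "_Amplitude.csv"))
  write.csv(df, path, row.names = FALSE)
}
write_fake_well("A01", 5)
write_fake_well("F05", 3)

# Read one file and tag its droplets with the well ID taken from the file name
read_well <- function(path) {
  df <- read.csv(path, check.names = FALSE)
  well <- sub("^.*_([A-H][0-9]{2})_Amplitude\\.csv$", "\\1", basename(path))
  data.frame(well = well,
             FAM  = df[["Ch1 Amplitude"]],  # assumed channel assignment
             HEX  = df[["Ch2 Amplitude"]])
}

files    <- list.files(export_dir, pattern = "_Amplitude\\.csv$", full.names = TRUE)
droplets <- do.call(rbind, lapply(files, read_well))
table(droplets$well)
```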

Analysis algorithm

The analysis automatically gates droplets into unique clusters using kernel density estimation and Gaussian mixture models applied to the droplet fluorescence amplitudes. The full algorithm is explained in detail in a package vignette. The main analysis steps are:

1. Identify and exclude wells with a failed ddPCR reaction.
2. Identify and exclude outlier droplets, defined as those exhibiting a set of fluorescence amplitude signals characteristic of an error in the fluorescence readout.
3. Identify and exclude empty droplets, i.e. those displaying a set of signals indicative of complete absence of DNA template.
4. Calculate the starting concentration of each template in the sample, defined as the number of copies per microlitre of input.
5. Assign droplets into clusters by gating the droplets based on their fluorescence amplitudes. QuantaSoft’s automatic gating does not account for rain droplets and therefore can produce inaccurate results when the density of rain falls above a threshold. The gating algorithm in ddpcr accounts for rain and is therefore better able to distinguish clusters in clinical samples, such as FFPE samples, for which significant rain is often observed. Manual gating is also available in ddpcr to permit secondary verification of results.
6. Count the number of droplets in each cluster.
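The empty-droplet and concentration steps above can be illustrated with a minimal sketch on simulated single-channel data. This is not the algorithm implemented in ddpcr (that is described in the package vignette). It only shows the general idea of locating a threshold at the valley of a kernel density estimate and then applying the standard Poisson correction, λ = −ln(N_empty / N_total), to obtain copies per droplet. The simulated amplitudes, the fixed bandwidth, and the nominal droplet volume of 0.85 nL are assumptions made for illustration.

```r
set.seed(1)

# Simulated amplitudes: 7000 empty droplets and 3000 template-positive droplets
amplitude <- c(rnorm(7000, mean = 1000, sd = 80),
               rnorm(3000, mean = 6000, sd = 200))

# Kernel density estimate with a deliberately wide, fixed bandwidth
dens <- density(amplitude, bw = 150)

# Strict local maxima of the density curve
peak_idx <- which(diff(sign(diff(dens$y))) == -2) + 1

# Keep the two tallest peaks (taken to be the two cluster modes), left to right
top2 <- sort(peak_idx[order(dens$y[peak_idx], decreasing = TRUE)[1:2]])

# Threshold: grid point with the lowest density between the two modes
between   <- top2[1]:top2[2]
threshold <- dens$x[between][which.min(dens$y[between])]

n_total <- length(amplitude)
n_empty <- sum(amplitude < threshold)

# Poisson correction: mean template copies per droplet
lambda <- -log(n_empty / n_total)

# Convert to copies per microlitre using an assumed droplet volume
droplet_volume_ul <- 0.00085
copies_per_ul <- lambda / droplet_volume_ul
```

The Poisson step is what makes the estimate robust to droplets that contain more than one template copy: only the fraction of empty droplets enters the calculation.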

Implementation

Plate objects are lists. Every S3 object in R has a base type upon which it is built. The plate object is implemented as an S3 object of class ddpcr_plate with the R list as its base type. Using a list allows for an easy way to bundle together the several different R objects describing a plate into one. All information required to analyze a plate is part of the plate object. Every plate object contains a set of nine elements that together fully describe and reproduce the current state of the dataset: plate_data, plate_meta, name, params, status, clusters, steps, dirty, version.

Using S3 to override base generic functions. Since the plate object is an S3 object, it can benefit from the use of generic functions. The plate object implements three common generic functions: print(), plot(), and subset(). The print() method does not take any extra arguments and prints a readable summary of a plate object to the console, giving an overview of the most important parameters of the plate, such as its name and size. The plot() method generates a scatter plot of every well in the dataset and is highly customizable through the many arguments it supports. While the base plot() method in R uses base R graphics, the plot() method for ddpcr_plate objects uses the ggplot2 package⁷. The subset() generic is overridden by a method that retains only a subset of wells from a larger plate.
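The mechanism can be sketched with a toy class. The class name `toy_plate`, its two fields, and the method bodies below are hypothetical and far simpler than the real ddpcr_plate object. The sketch only shows how a list with a class attribute picks up its own print() and subset() methods.

```r
# Constructor: a plain list with a class attribute (hypothetical toy class)
new_toy_plate <- function(name, plate_data) {
  structure(list(name = name, plate_data = plate_data), class = "toy_plate")
}

# print() method: short summary instead of dumping the whole list
print.toy_plate <- function(x, ...) {
  n_wells <- length(unique(x$plate_data$well))
  cat("toy plate '", x$name, "': ", n_wells, " well(s), ",
      nrow(x$plate_data), " droplet(s)\n", sep = "")
  invisible(x)
}

# subset() method: keep only the requested wells, return the same kind of object
subset.toy_plate <- function(x, wells, ...) {
  keep <- x$plate_data$well %in% wells
  x$plate_data <- x$plate_data[keep, , drop = FALSE]
  x
}

droplets <- data.frame(well = c("A01", "A01", "F05", "F05", "F05"),
                       HEX  = c(800, 5100, 900, 5200, 850),
                       FAM  = c(900, 6900, 7200, 7000, 950))

plate <- new_toy_plate("demo", droplets)
f05   <- subset(plate, "F05")
print(f05)
```

Because subset() returns an object of the same class, the result can be passed straight to any other method defined for that class.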

Plate types. A ddPCR assay can be characterized by the droplet populations that are expected to arise after amplification. For example, in a (FAM⁺)/(FAM⁺HEX⁺) assay (such as Figure 1) it is expected that most of the non-empty droplets will either be FAM⁺HEX⁺ or FAM⁺, but not HEX⁺. Similarly, a (HEX⁺)/(FAM⁺HEX⁺) assay means that there are expected to be no droplets that are only FAM⁺. To describe these two types of assays, we define the term "PN/PP" (positive-negative/positive-positive). This name is a reflection of the expected populations of non-empty droplets: one population of singly-positive droplets (such as HEX⁺ or FAM⁺), and one population of double-positive droplets.

This characterization of a ddPCR experiment defines the plate type of a plate object, and it determines what type of analysis to run on the data. The default and most basic plate type is ddpcr_plate, which can be used for any ddPCR dataset. Running the analysis on a plate of this type will perform the first few analysis steps of identifying failed wells, outlier droplets, and empty droplets, but will not carry out the automated gating. Since in PN/PP-type experiments there is a rough expectation of where the droplets should be, automated gating can be performed on plates of that type.

Using S3 to support inheritance. Inheritance means that every plate type has a parent plate type from which it inherits all its features, while specific behaviour can be added or modified. In ddpcr, transitive inheritance is implemented, which means that features are inherited from all ancestors rather than only the most immediate one. Multiple inheritance is not supported, meaning that each plate type can only have one parent.

The notion of inheritance is an important part of the ddpcr package, as it allows ddPCR data from different assay types to share many properties. For example, PN/PP assays are first treated using the analysis steps common to all ddPCR experiments, and then gated with an assay-specific step, so PN/PP assays can be thought of as inheriting the analysis from general ddPCR assays. Furthermore, the two types of PN/PP assays share many similarities, so they both inherit from a common PNPP plate type. Another benefit of inheritance in ddpcr is that it allows users to easily extend the functionality of the package by adding custom ddPCR plate types to gate different types of experiments. More information, including a fully worked example, on how to add a new plate type can be found in the package vignette (see ‘Software availability’).
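Single, transitive inheritance of this kind can be sketched in plain S3 with a class vector and NextMethod(). The class names, the generic `toy_analyze()`, and the step labels below are hypothetical. How ddpcr itself wires plate types together is documented in the package vignette and may differ from this sketch.

```r
# Hypothetical generic standing in for the package's analysis entry point
toy_analyze <- function(plate, ...) UseMethod("toy_analyze")

# Steps common to every plate type
toy_analyze.base_plate <- function(plate, ...) {
  plate$steps_run <- c(plate$steps_run,
                       "remove_failures", "remove_outliers", "remove_empty")
  plate
}

# PN/PP plates: run the parent's steps first, then add gating
toy_analyze.pnpp_plate <- function(plate, ...) {
  plate <- NextMethod()
  plate$steps_run <- c(plate$steps_run, "gate_droplets")
  plate
}

# A specific PN/PP assay: inherits from pnpp_plate, and through it from base_plate
toy_analyze.fam_positive_pnpp_plate <- function(plate, ...) {
  plate <- NextMethod()
  plate$steps_run <- c(plate$steps_run, "fam_positive_specific_step")
  plate
}

# The class vector lists the type followed by all of its ancestors
plate <- structure(list(steps_run = character(0)),
                   class = c("fam_positive_pnpp_plate", "pnpp_plate", "base_plate"))

result <- toy_analyze(plate)
result$steps_run
```

Adding a new assay type then amounts to writing one more method and placing its class in front of an existing chain.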

Shiny web application

The ddpcr package includes a web application that allows users to perform an analysis of ddPCR data in an interactive visual environment. The web application, written using the Shiny package v0.11⁶, implements most of the features available in the ddpcr package and makes them accessible via a simple point-and-click interface. The Shiny application can be a useful tool for persons not comfortable with R programming or simply as a more convenient way to perform an analysis. However, since the web application only supports a curated subset of the ddpcr functions, it is not as powerful as using the command-line interface.

The ddpcr Shiny application includes four main tabs that mimic the natural flow of a ddPCR analysis (Figure 4): upload a dataset, configure analysis parameters, analyze the plate, and explore the results. At any point during the session, the current plate object can be downloaded and saved, and can be loaded into either the R command-line or the web application at a later time to continue the analysis.

The application is freely available online at http://daattali.com/shiny/ddpcr and is hosted on a server located in San Francisco, California. All data that is uploaded to the application is deleted when a user session ends, and none of the data is stored permanently. However, some users may prefer to run the application locally, which can be done using the ddpcr::launch() function.

Figure 4.

Screenshot from the ddpcr web application during an analysis of the sample ddPCR dataset.

Use case

Dataset 1. Raw ddPCR data from application of the ddPCR assay against BRAF-V600 mutations.

This data can be loaded and displayed in QuantaSoft™. Column 12 on the plate is from a different experiment and is not considered part of the dataset.

Dataset 2. The set of exported CSV files of the data presented in Dataset 1.

We have applied ddpcr to data (Dataset 1) from a novel ddPCR assay against somatic point mutations in the BRAF-V600 codon that was applied to FFPE specimens from a cohort of colorectal cancer (CRC) patients⁸. V600 mutations are observed in approximately 10% of colorectal tumours⁹ and their detection in CRC patients helps determine disease prognosis and treatment regimen. Through its droplet gating algorithm, ddpcr accurately identified droplet clusters and the number of droplets within each to provide the information needed to compute the frequency of mutated BRAF genes (Supplementary Figure 3).

To assess the accuracy of results from ddpcr, we compared BRAF-V600 mutation frequencies determined from the output of ddpcr with results obtained by two independent methods. V600 mutation frequencies computed from automated ddpcr results were within 3% of those obtained by manual analysis of the ddPCR data by an experienced operator (Supplementary Figure 4 and Supplementary Table 1). In addition, the BRAF-V600 status for each sample in the entire cohort was classified as mutant or wild-type by a certified pathologist using an immunohistochemical staining assay⁸. We obtained complete agreement between the pathologist’s binary classification of BRAF status and that determined using ddpcr.
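For concreteness, the mutation frequency used in this comparison is defined (see Supplementary Figure 3) as the ratio of FAM-positive droplets to the sum of FAM-positive and double-positive droplets. The sketch below computes it for one well from two sets of droplet counts. The counts are made up and the function name `mutation_frequency` is hypothetical.

```r
# Mutation frequency = FAM-positive / (FAM-positive + double-positive)
mutation_frequency <- function(n_fam_positive, n_double_positive) {
  n_fam_positive / (n_fam_positive + n_double_positive)
}

# Made-up droplet counts for a single well
auto_counts   <- c(fam_positive = 120, double_positive = 880)  # automated gating
manual_counts <- c(fam_positive = 118, double_positive = 890)  # manual gating

freq_auto   <- mutation_frequency(auto_counts[["fam_positive"]],
                                  auto_counts[["double_positive"]])
freq_manual <- mutation_frequency(manual_counts[["fam_positive"]],
                                  manual_counts[["double_positive"]])

# Absolute difference between the two estimates, in percentage points
diff_pct_points <- 100 * abs(freq_auto - freq_manual)
```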

We also analyzed the same dataset using QuantaSoft version 1.6.6. FAM-positive and double-positive droplets were not recognized as distinct clusters in 9 out of the 16 mutant-positive BRAF samples (Figure 5A).

Figure 5.

Comparison between droplet gating in (A) QuantaSoft and (B) ddpcr. Both tools analyzed the same ddPCR experiment (well F05) from an assay designed to quantify wild-type (double-positive) and mutant (FAM-positive) alleles of the BRAF gene. (A) QuantaSoft failed to assign the double-positive and FAM-positive droplets into unique clusters, instead assigning all droplets recording a high FAM signal to a single cluster; (B) ddpcr assigned droplets into one of three uniquely identified clusters (double-positive (green), FAM-positive (orange), and empty (black)), or rain (blue).

Discussion

We present ddpcr, an R package that allows users to analyze ddPCR data and explore the results, both programmatically using R and via an interactive web application. To demonstrate clinical utility, a case study performed on a cohort of CRC patients showed that BRAF-V600 mutation frequencies determined using ddpcr are consistent with results from two independent methods. The analysis runtime was 17 seconds, observed on a 64-bit Ubuntu 14.04.2 machine with 512MB of RAM and a single core Intel(R) Xeon(R) CPU E5-2630 at 2.30GHz. The package documentation includes details on extending the package, explanations of the algorithms used, and a walkthrough of a fully worked example.

Data availability

F1000Research: Dataset 1. Raw ddPCR data from application of the ddPCR assay against BRAF-V600 mutations, 10.5256/f1000research.9022.d126032¹⁰

F1000Research: Dataset 2. The set of exported CSV files of the data presented in Dataset 1., 10.5256/f1000research.9022.d126033¹¹

Dataset 1 is also available as a sample dataset within the ddpcr package. To access the data via the web application, select the tab Use sample dataset, choose Large dataset, and then click Load data. To access the data in R, run the following command to store the dataset as a plate object: my_data <- ddpcr::sample_plate("large").

Software availability

Software available from: http://cran.r-project.org/package=ddpcr or https://github.com/daattali/ddpcr

The free web tool can be accessed online at http://daattali.com/shiny/ddpcr, or run locally via the ddpcr package with the command ddpcr::launch().

Latest source code: https://github.com/daattali/ddpcr

Archived source code at time of publication: https://dx.doi.org/10.6084/m9.figshare.3423725¹²

License: MIT

Author contributions

DA wrote the code and produced the figures. RB ran the ddPCR experiments. JB provided ideas and feedback for the analysis algorithm. CH provided ideas and feedback for the functionality of the ddpcr package. DA wrote the manuscript with feedback from JB, RB, and CH. All authors approved the final manuscript.

Acknowledgments

We would like to thank Dr. Ryan Brinkman and his lab members for their time and advice.

Supplementary material

Figure 1. Exporting droplet fluorescence data from QuantaSoft.

Figure 2. Exporting plate metadata from QuantaSoft.

Figure 3. Automated ddpcr droplet gating results for raw assay output for the cohort of 32 CRC patient samples. The numbers show the calculated BRAF-V600 mutation frequency, defined as the ratio of FAM-positive droplets to the sum of FAM-positive and double-positive droplets. Background colours: green = sample is classified as wild-type, purple = sample is classified as mutant, grey = failed ddPCR run.

Figure 4. Comparison of the mutation frequency in each patient sample as calculated automatically by ddpcr vs that determined independently manually by an expert technician. The grey line represents the y = x line.

Table 1. Comparison of the mutation frequency in each patient sample as calculated automatically by ddpcr vs that determined independently manually by an expert technician.

References

1. Hindson BJ, Ness KD, Masquelier DA, et al.: High-throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal Chem. 2011; 83(22): 8604–8610.
2. Bizouarn F: Introduction to digital PCR. Methods Mol Biol. 2014; 1160: 27–41.
3. Jones M, Williams J, Gärtner K, et al.: Low copy target detection by Droplet Digital PCR through application of a novel open access bioinformatic pipeline, ‘definetherain’. J Virol Methods. 2014; 202(100): 46–53.
4. Trypsteen W, Vynck M, De Neve J, et al.: ddpcRquant: threshold determination for single channel droplet digital PCR experiments. Anal Bioanal Chem. 2015; 407(19): 5827–5834.
5. R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2016.
6. Chang W, Cheng J, Allaire JJ, et al.: shiny: Web Application Framework for R. R package version 0.13.2. 2016.
7. Wickham H: ggplot2: Elegant Graphics for Data Analysis. Use R. Springer-Verlag New York, 2009.
8. Bidshahri R, Attali D, Fakhfakh K, et al.: Quantitative Detection and Resolution of BRAF V600 Status in Colorectal Cancer Using Droplet Digital PCR and a Novel Wild-Type Negative Assay. J Mol Diagn. 2016; 18(2): 190–204.
9. Tol J, Nagtegaal ID, Punt CJ: BRAF mutation in metastatic colorectal cancer. N Engl J Med. 2009; 361(1): 98–99.
10. Attali D, Bidshahri R, Haynes C, et al.: Dataset 1 in: ddpcr: an R package and web application for analysis of droplet digital PCR data. F1000Research. 2016.
11. Attali D, Bidshahri R, Haynes C, et al.: Dataset 2 in: ddpcr: an R package and web application for analysis of droplet digital PCR data. F1000Research. 2016.
12. Attali D: ddpcr-1.3.zip. figshare. 2016.

Comments on this article

Reader Comment, 31 Aug 2016

Stefan Rödiger, Brandenburg University of Technology Cottbus - Senftenberg, Germany

The presented work by Attali et al. is an interesting contribution to the growing knowledge about dPCR and the analysis thereof. The authors state that the “...clinical adoption has been slowed in part by the lack of software tools available for analyzing ddPCR data.” Actually, the landscape of dPCR data analysis is much larger than described. For the sake of completeness we would like to raise awareness of the dpcR package, which is the first R package (available from CRAN since September 2013) devoted to the analysis of dPCR data. The functionality of the dpcR package is targeted at users of droplet dPCR and chamber dPCR. Similarly to ddpcr, dpcR offers a Shiny GUI, visualization tools (ggplot2, ...), simulation tools, import functionality and a collection of data. For further reading please refer to the following publications:

Rödiger S, Burdukiewicz M, Blagodatskikh KA, Schierack P. R as an Environment for the Reproducible Analysis of DNA Amplification Experiments. The R Journal [Internet]. 2015;7(2):127–50. Available from: http://journal.r-project.org/archive/2015-1/RJ-2015-1.pdf

Burdukiewicz M, Rödiger S, Sobczyk P, Menschikowski M, Schierack P, Mackiewicz P. Methods for comparing multiple digital PCR experiments. Biomol Detect Quantif. 2016 Sep;9:14–9. http://www.sciencedirect.com/science/article/pii/S2214753516300171

Burdukiewicz M, Spiess AN, Schierack P and Rödiger S. dpcR: an R package for the analysis of digital PCR [v1; not peer reviewed]. F1000Research 2016, 5:215 (poster) (doi:10.7490/f1000research.1111325.1) http://f1000research.com/posters/5-215

Michal Burdukiewicz, Stefan Rödiger, Bart Jacobs, Piotr Sobczyk. Digital PCR Analysis [R package dpcR version 0.3]. [cited 2016 Jun 30]; Available from: http://CRAN.R-project.org/package=dpcR

In addition, there are further studies using the R statistical computing environment for dPCR data, which might be of interest for the presented study:

Vynck M, Vandesompele J, Nijs N, Menten B, De Ganck A, Thas O. Flexible analysis of digital PCR experiments using generalized linear mixed models. Biomol Detect Quantif. [Internet]. 2016 Sep [cited 2016 Aug 23];9:1–13. http://www.sciencedirect.com/science/article/pii/S2214753516300146

Dorazio RM, Hunter ME. Statistical Models for the Analysis and Design of Digital Polymerase Chain Reaction (dPCR) Experiments. Anal Chem. 2015 Nov 3;87(21):10886–93. http://pubs.acs.org/doi/10.1021/acs.analchem.5b02429

We hope this information is valuable for the work presented by Attali et al.

Respectfully,

Stefan Rödiger on behalf of all package authors
Competing interests: There are no competing interests to disclose.

Open Peer Review

Reviewer Report 14 Sep 2016

Stephanie L. Hazlitt, Government of British Columbia, Victoria, BC, Canada

Andy Teucher, Ministry of Environment - Province of British Columbia, Victoria, BC, Canada

Approved

https://doi.org/10.5256/f1000research.9706.r15716

The software tool article 'ddpcr: an R package and web application for analysis of droplet digital PCR data' is well written and includes sufficient detail for the reader to assess the tool's construction, implementation and outputs. The ddpcr R package (v1.5) functions well and includes clearly written, detailed vignettes that support the user and explain what is happening 'under the hood'. In addition, the source code is openly available on GitHub, which allows advanced R users to investigate details of the implementation. In general, we recommend more explicit links in the paper to the supporting vignettes and the package README.

In addition to being a useful tool for standard data science steps (get data, visualize data, analyze data), ddpcr provides the ddPCR community with a new, well-documented analytical methodology for ddPCR clustering, and hence opportunities for assessing the robustness of clustering techniques in this field more generally. This contribution could be made clearer in the Introduction and/or Motivation sections.

Figure 5, referenced multiple times in the Introduction, could be moved alongside the text for the reader.

The Methods structure could be arranged to follow standard data science steps and Figure 3: (1) get data [raw, export, csv], (2) import data [new_plate()], (3) analyze data [analyze()]. There is no section on (4) visualize data [plot()], or on the likely user cycle of analyze, visualize, change analysis parameters, analyze, visualize. With clustering, I am guessing this will be a common and important set of repeated steps. I would end the section with the example workflow code to reinforce the steps.

The Analysis sub-section might benefit from some supporting references for the selected analyses (e.g. kernel density estimation, Gaussian mixture models), URL links to the ddpcr::analysis vignette on GitHub, and, if the paper word limit allows, a bit more detail in the paper itself, especially for the 'assign droplets into clusters' step.

The examples show rain being identified in the FAM (empty vs filled; vertical direction), but not between FAM+ and FAM+HEX+ (mutant vs wildtype; horizontal direction). Is this because the algorithm doesn't allow for this, or because all of the droplets in that range could be classified as FAM+ or FAM+HEX+?

Also in the Analysis sub-section, a description of the next_step() function would be useful for those who want to understand each step in the process. A potential enhancement to the package would be a plot function/method for each step that visually displays the results of each step in the analysis.

It might be useful to synchronize the language used to describe the clusters with the attribute language in output objects in the package. For example, in the paper and vignettes clusters are "double positive, FAM+, HEX+, or double negative". In clusters() the clusters are POSITIVE, NEGATIVE, RAIN, EMPTY, etc. The plate_data function outputs clusters as numbers 1 through 7. The plate_meta function outputs clusters as mutant_num and wildtype_num. With a bit of sleuthing it becomes clear that you can define the resulting cluster names with new_plate(), which is useful but may not be intuitive for many users.

Other minor suggestions for potential enhancements after a test drive of ddpcr:
- Plotting:
      - adding a legend to plot() would be very useful
      - show_grid_labels = TRUE as default?
      - visually distinguish rain from empty? Figure 5 in the paper shows this, but the plot generated by running the code in the 'Workflow' section does not.
- The plate_meta data.frame in a ddpcr_plate object could be a tibble for better printing
- The plate_meta data.frame (after analysis) could also display the rain count

Competing Interests: No competing interests were disclosed.

We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.


Reviewer Report 04 Jul 2016

Timothy J. Triche Jr, Jane Anne Nohl Division of Hematology, USC/Norris Comprehensive Cancer Center, Keck School of Medicine of USC, Los Angeles, CA, USA

Approved

https://doi.org/10.5256/f1000research.9706.r14448

The ddpcr software is extensively documented and works as described. There are some minor changes that might be relevant (e.g. in ddpcr v1.4, the plot contains a percentage estimate in the lower right-hand quadrant which is not shown in figure 1, and the X- and Y-axes are labelled slightly differently in the actual output from running the example). Overall, however, the package offers both a powerful tool for end users which advances the state of the art, and also a foundation for further algorithmic development. Cross-pollination between the flow cytometry community and ddpcr in the R ecosystem will likely lead to advances that would not otherwise have occurred.

Figure 3 should be moved alongside the text in the paragraph "Analysis algorithm".

Figure 5 is a powerful demonstration of the rationale for the library's creation and could perhaps stand to be moved up in the body of the paper. Standard practice is to provide a reason for the user to care about a method (e.g. figure 5), then describe the nuts and bolts of the implementation. There is a reason that this is standard practice.

Time permitting, this reviewer has sought and obtained some much more challenging samples. In a revised or separate publication, it would be of interest to compare the software's performance on these ddPCR runs (where 1 in 10000 cells carries a mutation, but that 1 in 10000 is verified by multiple orthogonal sequencing methods) and determine whether the algorithmic flexibility of the ddpcr package would allow users to automate what is currently a highly technical process for detecting rare mutations. Unfortunately, the manufacturer has been unhelpful in converting this data to a usable format within the timeline of the review, so I cannot in good conscience delay it further.

Outside of the scope of this paper, but relevant to the introduction, reproducibility and comparability would be much improved if the raw .QLP files could be parsed by R itself, making exchange and sharing of data far easier. Inquiries placed to Bio-Rad regarding the file format and the QuantaSoft package went unanswered, suggesting that (as with Illumina and their .IDAT format) the eventual solution will be to reverse engineer the format and brute-force the problem. It is unfortunate that some scientific instrument vendors appear to value imaginary profits over actual scientific merit, but such is life. Reverse engineering the file format is not a reasonable request here, but represents a future direction to further improve reproducibility of this important analytical technique. Existing tools rely upon ddpcr data provided in .CSV format; the first to elide this requirement (and that of a $5000 software package merely to review output) will be a notable step towards transparency for an increasingly powerful genotyping technique.

In conclusion, the ddpcr implementation and its extensive documentation (here in this paper and in the copious examples provided with the package) represent a solid foundation for further methodological improvements to an important assay platform.

Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.


Author Response 19 Aug 2016

Dean Attali, Bioinformatics Training Program, University of British Columbia, Vancouver, Canada


Thank you for the thorough and prompt review. I will address the comments once I get a review from the second reviewer (it's taking longer than expected to find someone!)
Competing Interests: No competing interests were disclosed.


Comments on this article (1)


Reader Comment 31 Aug 2016

Stefan Rödiger, Brandenburg University of Technology Cottbus - Senftenberg, Germany

The presented work by Attali et al. is an interesting contribution to the growing knowledge about dPCR and its analysis. The authors state that “...clinical adoption has been slowed in part by the lack of software tools available for analyzing ddPCR data.” In fact, the landscape of dPCR data analysis is much larger than described. For the sake of completeness, we would like to draw attention to the dpcR package, which is the first R package (available from CRAN since September 2013) devoted to the analysis of dPCR data. The functionality of the dpcR package is targeted at users of droplet dPCR and chamber dPCR. Like ddpcr, dpcR offers a Shiny GUI, visualization tools (ggplot2, ...), simulation tools, import functionality and a collection of data. For further reading, please refer to the following publications:

Rödiger S, Burdukiewicz M, Blagodatskikh KA, Schierack P. R as an Environment for the Reproducible Analysis of DNA Amplification Experiments. The R Journal [Internet]. 2015;7(2):127–50. Available from: http://journal.r-project.org/archive/2015-1/RJ-2015-1.pdf

Burdukiewicz M, Rödiger S, Sobczyk P, Menschikowski M, Schierack P, Mackiewicz P. Methods for comparing multiple digital PCR experiments. Biomol Detect Quantif. 2016 Sep;9:14–9. http://www.sciencedirect.com/science/article/pii/S2214753516300171

Burdukiewicz M, Spiess AN, Schierack P and Rödiger S. dpcR: an R package for the analysis of digital PCR [v1; not peer reviewed]. F1000Research 2016, 5:215 (poster) (doi:10.7490/f1000research.1111325.1) http://f1000research.com/posters/5-215

Michal Burdukiewicz, Stefan Rödiger, Bart Jacobs, Piotr Sobczyk. Digital PCR Analysis [R package dpcR version 0.3]. [cited 2016 Jun 30]; Available from: http://CRAN.R-project.org/package=dpcR

In addition, there are further studies using the R statistical computing environment for dPCR data, which might be of interest for the presented study:

Vynck M, Vandesompele J, Nijs N, Menten B, De Ganck A, Thas O. Flexible analysis of digital PCR experiments using generalized linear mixed models. Biomol Detect Quantif. [Internet]. 2016 Sep [cited 2016 Aug 23];9:1–13. http://www.sciencedirect.com/science/article/pii/S2214753516300146

Dorazio RM, Hunter ME. Statistical Models for the Analysis and Design of Digital Polymerase Chain Reaction (dPCR) Experiments. Anal Chem. 2015 Nov 3;87(21):10886–93. http://pubs.acs.org/doi/10.1021/acs.analchem.5b02429

We hope this information is valuable for the work presented by Attali et al.

Respectfully,

Stefan Rödiger on behalf of all package authors
Competing Interests: There are no competing interests to disclose.


Alongside their report, reviewers assign a status to the article:

Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested

Approved with reservations - a number of small changes, or sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.

Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions


[1] Hindson BJ, Ness KD, Masquelier DA, et al.: High-throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal Chem. 2011; 83(22): 8604–8610.

[2] Bizouarn F: Introduction to digital PCR. Methods Mol Biol. 2014; 1160: 27–41.

[3] Jones M, Williams J, Gärtner K, et al.: Low copy target detection by Droplet Digital PCR through application of a novel open access bioinformatic pipeline, ‘definetherain’. J Virol Methods. 2014; 202(100): 46–53.

[4] Trypsteen W, Vynck M, De Neve J, et al.: ddpcRquant: threshold determination for single channel droplet digital PCR experiments. Anal Bioanal Chem. 2015; 407(19): 5827–5834.

[5] R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2016.

[6] Chang W, Cheng J, Allaire JJ, et al.: shiny: Web Application Framework for R. R package version 0.13.2. 2016.

[7] Wickham H: ggplot2: Elegant Graphics for Data Analysis. Use R. Springer-Verlag New York, 2009.

[8] Bidshahri R, Attali D, Fakhfakh K, et al.: Quantitative Detection and Resolution of BRAF V600 Status in Colorectal Cancer Using Droplet Digital PCR and a Novel Wild-Type Negative Assay. J Mol Diagn. 2016; 18(2): 190–204.

[9] Tol J, Nagtegaal ID, Punt CJ: BRAF mutation in metastatic colorectal cancer. N Engl J Med. 2009; 361(1): 98–99.

[10] Attali D, Bidshahri R, Haynes C, et al.: Dataset 1 in: ddpcr: an R package and web application for analysis of droplet digital PCR data. F1000Research. 2016.

[11] Attali D, Bidshahri R, Haynes C, et al.: Dataset 2 in: ddpcr: an R package and web application for analysis of droplet digital PCR data. F1000Research. 2016.

[12] Attali D: ddpcr-1.3.zip. figshare. 2016.
