Research Article

Better than we thought? The diagnostic performance of an influenza point-of-care test in children, a Bayesian re-analysis

[version 1; peer review: 1 approved with reservations, 1 not approved]
PUBLISHED 18 Jan 2017

Abstract

Background: Point-of-care tests (POCTs) for influenza have been criticised for their diagnostic accuracy, with clinical use limited by low sensitivity. These criticisms are based on diagnostic-accuracy studies that often use the questionable assumption of an infallible gold standard. Bayesian latent class modelling can estimate diagnostic performance without this assumption. Methods: Data extracted from published diagnostic-accuracy studies comparing the QuickVue® Influenza A+B POCT to reverse-transcriptase polymerase chain reaction (RT-PCR) in two different populations were re-analysed. Classical and Bayesian latent class methods were applied using the Modelling for Infectious diseases CEntre (MICE) web-based application. Results: Under classical analyses the estimated sensitivity and specificity of the QuickVue® were 66.9% (95% confidence interval (CI) 61.4-71.9) and 97.8% (95% CI 95.7-98.9), respectively. Bayesian latent class models estimated sensitivity of 97.8% (95% credible interval (CrI) 82.1-100) and specificity of 98.5% (95% CrI 96.5-100). Conclusions: Data from studies comparing the QuickVue® point-of-care test to RT-PCR are compatible with better diagnostic performance than previously reported.

Keywords

Bayesian latent class models, influenza, diagnostic accuracy, point-of-care test, near-patient test, primary care, paediatrics

Introduction

Influenza is an infectious disease of global importance and is a target of many near-patient tests1,2. These tests have been criticised for reported low sensitivity. This relatively poor ability to ‘rule out’ infection has been given as a reason to avoid their use in clinical practice and to develop better tests instead3. There are reasons to suspect some diagnostic-accuracy studies of point-of-care tests (POCTs) may have systematically underestimated sensitivity. If so, the diagnostic accuracy of existing tests may be better than previously thought, with implications for clinical practice and test development.

Classic diagnostic-accuracy studies compare the performance of the index (new) test with a reference (pre-existing) test on samples from the same patients. Although rarely explicitly stated, the reference test is assumed to be an infallible ‘gold standard’. Under this assumption, whenever the index test and the reference test results differ, the index test is assumed to be wrong. This prevents the index test from outperforming the reference, and may systematically underestimate test performance. Many diagnostic-accuracy studies of point-of-care tests for influenza have used these classical methods, raising the possibility that their diagnostic performance has been artificially suppressed4.
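The effect of this assumption can be sketched in a few lines: under a classical analysis, the index test's accuracy is computed directly from the 2×2 table, with the reference test defining ‘truth’. The counts below are illustrative only, not taken from the re-analysed studies.

```python
def classical_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity of the index test under the classical
    assumption that the reference test defines true disease status."""
    sensitivity = tp / (tp + fn)  # index positives among reference positives
    specificity = tn / (tn + fp)  # index negatives among reference negatives
    return sensitivity, specificity

# Illustrative counts: disease status is taken from the reference test, so any
# disagreement (including a reference false positive) counts against the index test.
sens, spec = classical_accuracy(tp=67, fp=2, fn=33, tn=98)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

If the reference test itself produces false positives, some of the apparent ‘false negatives’ of the index test are artefacts, and sensitivity computed this way is biased downward.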

Established techniques for when a ‘gold standard’ is not available include: constructing a composite reference standard from a panel of multiple tests, re-testing discrepant results, and statistical modelling5. Bayesian latent class models are one such statistical technique6,7. Unlike many other methods, they offer an opportunity to retrospectively analyse existing data, provided a test has been compared to the same reference standard in more than one population6. To my knowledge, this study is the first Bayesian re-analysis of point-of-care tests for influenza.

This paper aims to examine the extent to which published estimates of influenza point-of-care test accuracy are constrained by the infallible gold standard assumption, with a view to informing clinical practice, and future diagnostic-accuracy studies.

Methods

Published data were re-analysed using Bayesian latent class modelling and classical analysis. Data were extracted from two studies8,9 comparing the same reference and index tests (reverse-transcriptase polymerase chain reaction (RT-PCR) vs. QuickVue® Influenza A+B) in two separate primary care populations.

Analyses were performed using the free online application Modelling for Infectious diseases CEntre (MICE; http://mice.tropmedres.ac/home.aspx), which has been described elsewhere, and runs parallel analyses of Bayesian latent class models and classical frequentist statistics for diagnostic test accuracy7. Data are input into MICE via a simple online portal, and results are stored online or emailed to the user.

MICE employs Markov Chain Monte Carlo (MCMC) simulations. These use the data provided to estimate all unknown parameters: the specificity and sensitivity of both reference and index tests, and the prevalence in the study population(s). The predicted combinations of test results are compared to the actual observed data, and the process is iterated, ideally until the estimates converge on the best fitting values for specificity, sensitivity and prevalence. MICE presents these results in the form of a table, with further graphs of the iterated estimates to allow the user to check convergence of MCMC chains, and Bayesian P values to allow the fit of the final model to the observed data to be assessed.

For this study the ‘two tests in one population’ model was selected and default values were used. Under the default settings, non-informative priors (beta distribution 0.5, 0.5) initiate the analysis, with the specificity of both tests constrained to above 40%. MCMC simulation used the default initial values (prevalence: 90% and 30%; sensitivity: 90% and 70%; specificity: 90% and 99%). The analysis ran for 5,000 iterations of pre-analysis adjustment (burn-in), followed by 20,000 sampling iterations.
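As a rough sketch of what such an analysis involves (this is not MICE's implementation, whose sampler and parameterisation differ), the ‘two tests in one population’ likelihood and a minimal random-walk Metropolis sampler might look like:

```python
import numpy as np

def pattern_probs(prev, se_ref, sp_ref, se_poct, sp_poct):
    """Probability of each (reference, index) result pair when true disease
    status is latent, assuming the tests err independently given status."""
    probs = {}
    for r in (1, 0):          # reference (RT-PCR) result
        for i in (1, 0):      # index (POCT) result
            p_dis = (se_ref if r else 1 - se_ref) * (se_poct if i else 1 - se_poct)
            p_non = ((1 - sp_ref) if r else sp_ref) * ((1 - sp_poct) if i else sp_poct)
            probs[(r, i)] = prev * p_dis + (1 - prev) * p_non
    return probs

def log_likelihood(counts, theta):
    """Multinomial log-likelihood of the observed pattern counts."""
    probs = pattern_probs(*theta)
    return sum(n * np.log(probs[k]) for k, n in counts.items())

def metropolis(counts, n_iter=20000, burn_in=5000, step=0.02, seed=1):
    """Random-walk Metropolis with flat priors on (0, 1) and specificities
    constrained above 40%, loosely mirroring the MICE defaults above."""
    rng = np.random.default_rng(seed)
    theta = np.array([0.3, 0.9, 0.9, 0.7, 0.9])  # prev, se_ref, sp_ref, se_poct, sp_poct
    ll = log_likelihood(counts, theta)
    samples = []
    for it in range(n_iter):
        prop = theta + rng.normal(0.0, step, size=5)
        if np.all((prop > 0) & (prop < 1)) and prop[2] > 0.4 and prop[4] > 0.4:
            ll_prop = log_likelihood(counts, prop)
            if np.log(rng.random()) < ll_prop - ll:
                theta, ll = prop, ll_prop
        if it >= burn_in:  # discard burn-in, keep the remaining draws
            samples.append(theta.copy())
    return np.array(samples)
```

Note that with only two tests in a single population the data provide fewer degrees of freedom than there are parameters, so the priors and constraints materially influence the posterior; this is one reason latent class re-analysis benefits from the same test pair being compared in more than one population.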

Results

Data were extracted from two studies comparing a QuickVue® point-of-care test to reverse-transcriptase PCR in children with influenza-like illness. Gordon et al8 studied 989 children in Nicaragua; Harnden et al9 included 157 children in England. Patient characteristics and study procedures were similar, with a low risk of bias (Table 1).

Table 1. Characteristics of the two data sets used for reanalysis.

| Study characteristic | Harnden et al. 2003 | Gordon et al. 2009 |
| --- | --- | --- |
| Patients | | |
| Criteria for inclusion | GPs identified “children with cough and fever who they thought had more than a simple cold” | “…fever, or history of fever or feverishness, and cough and/or sore throat within five days of symptom onset…” |
| Dates | January to March 2001 and October to March 2002 | January 1st to December 31st, 2008 |
| Number and setting | 157 children routinely attending primary care in Oxfordshire | 989 children in a cohort study in Nicaragua |
| Age | Six months to 12 years, median 3 years | Two to 13 years, mode 3 years |
| Gender | 100 (63.7%) male | 576 (49.8%) male |
| Tests and procedures | | |
| Index test | QuickVue Influenza A+B | QuickVue Influenza A+B |
| Target condition and reference standard | Influenza A & B, RT-PCR | Influenza A & B, RT-PCR |
| Flow and timing | Nasal swab sample by a research nurse, who undertook POCT immediately; nasopharyngeal aspirate from other nostril for RT-PCR sent to laboratory within four hours | Nasal swab for POCT performed immediately, followed by nasal and throat swabs for RT-PCR in central laboratory after storage at 4°C for up to 48 hrs |
| Risk of bias assessment | | |
| Was a consecutive or random sample of patients enrolled? | Consecutive, from routine clinical practice | Random selection from 3,935 medical visits (of 13,666 in cohort) that met the criteria |
| Was a case-control design avoided? | Yes | Yes |
| Did the study avoid inappropriate exclusions? | Yes | Yes |
| Could the selection of patients have introduced bias? | Low risk | Low risk |
| Concerns regarding applicability | Low risk | Low risk |
| Are there concerns that the included patients and setting do not match the review question? | Low concern | Low concern |
| Overall risk of bias | Low risk | Low risk |
| Citation | Harnden A, Brueggemann A, Shepperd S, et al. Near patient testing for influenza in children in primary care: comparison with laboratory test. BMJ 2003; 326: 480 | Gordon A, Videa E, Saborio S, et al. Performance of an influenza rapid test in children in a primary healthcare setting in Nicaragua. PLoS One 2009; 4: e7907 |

MCMC chains converged for all estimates. Model fit, assessed by Bayesian p values, showed close agreement between observed and expected values, with the exception of cases positive by RT-PCR but negative by point-of-care test in the English study: 26 were predicted and 34 observed (Bayesian p value 0.081; values close to 0.5 indicate a good fit).

In both populations estimated prevalence was lower under Bayesian analysis: 34.0% (95% credible interval (CrI) 29.6-40.3) vs. 45.3% (95% confidence interval (CI) 41.2-49.5) in Nicaragua, and 18.6% (95% CrI 12.7-27.6) vs. 38.9% (95% CI 31.3-47.0) in England (Table 2).

Table 2. Estimates of diagnostic accuracy and influenza prevalence using classical and Bayesian latent class models for the QuickVue® influenza A+B Point-of-care test versus the reverse-transcriptase polymerase chain reaction technique for influenza.

| Estimate | Classical model (95% confidence interval) | Bayesian latent class model (95% credible interval) |
| --- | --- | --- |
| Prevalence | | |
| Nicaragua | 45.3% (41.2-49.5) | 34.0% (29.6-40.3) |
| Oxfordshire | 38.9% (31.3-47.0) | 18.6% (12.7-27.6) |
| Reverse-transcriptase polymerase chain reaction reference test | | |
| Sensitivity | Assumed 100% (100-100) | 98.8% (94.3-100) |
| Specificity | Assumed 100% (100-100) | 80.1% (75.9-87.0) |
| Positive predictive value | Assumed 100% (100-100) | 68.4% (62.0-81.0) |
| Negative predictive value | Assumed 100% (100-100) | 99.3% (96.7-100) |
| QuickVue® Influenza A+B point-of-care test | | |
| Sensitivity | 66.9% (61.4-71.9) | 97.8% (82.1-100) |
| Specificity | 97.8% (95.7-98.9) | 98.5% (96.5-100) |
| Positive predictive value | 96.0% (92.3-98.0) | 96.6% (92.0-100) |
| Negative predictive value | 79.0% (75.2-82.4) | 99.0% (90.7-100) |

RT-PCR performance was assumed to be 100% under the classical model, and estimated by Bayesian modelling. Bayesian sensitivity and negative predictive values were close to the assumed values at 98.8% (95% CrI 94.3-100) and 99.3% (95% CrI 96.7-100), but specificity 80.1% (95% CrI 75.9-87.0) and positive predictive value 68.4% (95% CrI 62.0-81.0) were reduced (Table 2).

The performance estimates for the QuickVue® point-of-care test were markedly different under Bayesian assumptions. Sensitivity increased from 66.9% (95% CI 61.4-71.9) to 97.8% (95% CrI 82.1-100). Accordingly, the estimate of negative predictive value also increased, from 79.0% (95% CI 75.2-82.4) to 99.0% (95% CrI 90.7-100). Specificity was more similar between models (classical 97.8%, 95% CI 95.7-98.9 vs. Bayesian 98.5%, 95% CrI 96.5-100), as was estimated positive predictive value (classical 96.0%, 95% CI 92.3-98.0 vs. Bayesian 96.6%, 95% CrI 92.0-100) (Table 2).
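Predictive values follow from sensitivity, specificity and prevalence via Bayes' theorem. As an illustration, plugging the Bayesian point estimates for the POCT and the Nicaraguan prevalence into the standard formulae gives values close to (but not exactly matching) those tabulated, since the table summarises full posterior distributions rather than point-estimate plug-ins:

```python
def predictive_values(sens, spec, prev):
    """Positive and negative predictive value via Bayes' theorem."""
    ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv

# Bayesian point estimates for the QuickVue POCT with Nicaraguan prevalence
ppv, npv = predictive_values(sens=0.978, spec=0.985, prev=0.340)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```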

Discussion

The classical results for QuickVue® presented here are typical. A systematic review of all point-of-care tests for influenza reported overall specificity of 98.2% (95% CI 97.5-98.7), but sensitivity of only 62.3% (95% CI 57.9-66.6)2. In contrast, Bayesian analysis estimated sensitivity of 97.8% (95% CrI 82.1-100), with a negative predictive value of 99.0% (95% CrI 90.7-100), suggesting a test of clinical importance with little room for improvement in the ability to ‘rule out’ infection, apparently answering one of the major criticisms of point-of-care tests3.

The findings suggest false positives by the ‘infallible’ RT-PCR reference test. RT-PCR multiplies nucleic acids exponentially, making it both highly sensitive and vulnerable to false positives. Even the smallest amount of contamination can lead to a false-positive result. This is well recognised, so laboratories often use multiple negative controls10. Gordon et al did not mention negative controls; Harnden et al used water.

A weakness is that the original data were not collected specifically for this analysis; there may therefore be differences in study conduct between Gordon et al. and Harnden et al. beyond the populations studied. Despite this, the studies appear to be remarkably similar. The imperfect fit of the data to one element of the Bayesian model should be balanced against classical modelling, where the fit of the data to the ‘perfect’ reference standard is rarely acknowledged, let alone assessed.

Overall, the findings are consistent with higher sensitivity than previously reported, and this underestimation can be attributed to the use of RT-PCR as a ‘gold standard’. These findings have implications for clinical practice, test development, and diagnostic-accuracy studies.

Data availability

Data used in this analysis are from the articles ‘Performance of an influenza rapid test in children in a primary healthcare setting in Nicaragua’8 by Gordon et al. (available at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0007907) and ‘Near patient testing for influenza in children in primary care: comparison with laboratory test’9 by Harnden et al. (available at http://www.bmj.com/content/326/7387/480).

How to cite this article
Lee J. Better than we thought? The diagnostic performance of an influenza point-of-care test in children, a Bayesian re-analysis [version 1; peer review: 1 approved with reservations, 1 not approved]. F1000Research 2017, 6:53 (https://doi.org/10.12688/f1000research.10068.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Reviewer Report 06 Mar 2017
Benjamin J Cowling, WHO Collaborating Centre for Infectious Disease Epidemiology and Control, School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong (HKU), Hong Kong, China 
Approved with Reservations
This is an interesting re-analysis of published data on rapid test sensitivity, making the case that the rapid test might be more sensitive than previously thought, because of inaccuracy of the gold standard.

Major comment

…
How to cite this report
Cowling BJ. Reviewer Report For: Better than we thought? The diagnostic performance of an influenza point-of-care test in children, a Bayesian re-analysis [version 1; peer review: 1 approved with reservations, 1 not approved]. F1000Research 2017, 6:53 (https://doi.org/10.5256/f1000research.10848.r20688)
Reviewer Report 10 Feb 2017
Nicolas Tremblay, Unité d'immunovirologie moléculaire, Centre de recherche du Centre hospitalier de l'Université de Montréal, Montreal, QC, Canada 
Not Approved
In this report, Lee uses a Bayesian latent class model to estimate the specificity and sensitivity of QuickVue Point-Of-Care-Test for Influenza A & B. The author concludes that, based on his retrospective analysis of two studies (Gordon et al, 2009; Harnden …
How to cite this report
Tremblay N. Reviewer Report For: Better than we thought? The diagnostic performance of an influenza point-of-care test in children, a Bayesian re-analysis [version 1; peer review: 1 approved with reservations, 1 not approved]. F1000Research 2017, 6:53 (https://doi.org/10.5256/f1000research.10848.r20109)
