Brief Report

Analyses within risk strata overestimate gain in discrimination: the example of coronary artery calcium scores

[version 1; peer review: 2 approved with reservations]
PUBLISHED 13 Apr 2022

Abstract

Risk prediction models are potentially useful tools for health practitioners and policy makers. When new predictors are proposed for addition to existing models, the improvement in discrimination is one of the main measures used to assess any increment in performance. In assessing such predictors, we observed two paradoxes: 1) the discriminative ability within every individual risk stratum was worse than for the overall population; 2) incremental discrimination after including a new predictor was greater within each individual risk stratum than for the whole population. We show two examples of the paradoxes and analyse the possible causes. The key cause of bias is use of the same prediction model both to stratify the population and as the base model to which the new predictor is added.

Keywords

ROC curve, C-statistic, risk prediction models, heart disease risk factors

Introduction

Several new biomarkers, including coronary artery calcium scores (CACS), have been proposed to improve cardiovascular (CVD) risk prediction models, such as the Framingham Risk Score (FRS) or Pooled Cohort Equations (PCE). Their incremental value is usually judged by any improvement in discrimination, using measures such as the C-statistic (the area under the receiver operating characteristic curve, AUC), despite some limitations.1,2
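For reference, the C-statistic can be read as a concordance probability. The notation below is introduced here for illustration and does not appear in the article: S denotes the predicted risk score and Y the event indicator, for a randomly chosen person i who has the event and a randomly chosen person j who does not.

```latex
C \;=\; \Pr\!\left(S_i > S_j \,\middle|\, Y_i = 1,\; Y_j = 0\right)
  \;+\; \tfrac{1}{2}\,\Pr\!\left(S_i = S_j \,\middle|\, Y_i = 1,\; Y_j = 0\right)
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of events from non-events. Because C depends only on how the scores rank the pairs, it is sensitive to how widely the scores are spread in the population being analysed.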

Several recent assessments of the added value of CACS to CVD risk models reported the gain within specific CVD risk strata, but restricting the study population in this way may inflate the apparent gain.

Methods

In the process of a systematic review to assess the incremental value of CACS beyond traditional CVD risk assessment, we identified two studies that report the change in C-statistic from adding CACS within CVD risk strata as well as for the overall cohort.3 We used these two studies to illustrate the observed paradoxes, and then explored possible reasons using a simple simulation. All analyses were performed with R Project for Statistical Computing (version 3.6.3, RRID: SCR_001905).

Results

Two studies provided sufficient data: the Heinz Nixdorf Recall (HNR) and Multi-Ethnic Study of Atherosclerosis (MESA) studies.4,5 Both compared the C-statistics of base models to C-statistics of extended models (including CACS) in sub-groups defined by CVD risk scores. The apparent increase in C-statistic from adding CACS was greater within every risk sub-group than for the overall cohort (Figure 1), so every stratum-specific gain was above the overall gain. There are two paradoxes, the first explaining the second.


Figure 1. Overall and stratum-specific C-statistics for base and CACS-extended models in two studies.3,4

Data for Panel A were extracted from Geisel 2017, Table 3; data for Panel B were extracted from Blaha 2021, Figure 2. Both studies compared the C-statistics of base models (FRS in HNR, PCE in MESA) to C-statistics of extended models (including CACS) in sub-groups defined by CVD risk scores. CACS: coronary artery calcium scores.

The first paradox is that the discriminative ability of the CVD risk score within individual CVD risk strata is worse than for the overall population. This surprising “finding” is a statistical artefact: the discriminative ability of a variable will always appear lower when its range is restricted (that is, within a more homogeneous population) than within the full (more heterogeneous) population.6
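The range-restriction effect can be reproduced with a few lines of simulation. The sketch below is illustrative only and is not the authors' R simulation: the function names, the standard normal risk score, the coefficients (intercept -1.0, slope 1.2), the sample size and the use of tertiles as risk strata are all assumptions chosen for the example.

```python
import math
import random


def c_statistic(scores, events):
    """C-statistic (AUC) from the Mann-Whitney rank-sum formula, using midranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores so they share one midrank.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    n_events = sum(events)
    n_non = len(events) - n_events
    rank_sum = sum(r for r, e in zip(ranks, events) if e)
    return (rank_sum - n_events * (n_events + 1) / 2) / (n_events * n_non)


def simulate_restriction(n=20000, slope=1.2, intercept=-1.0, seed=1):
    """Hypothetical cohort: one risk score, events drawn from a logistic model."""
    rng = random.Random(seed)
    score = [rng.gauss(0.0, 1.0) for _ in range(n)]
    event = [int(rng.random() < 1 / (1 + math.exp(-(intercept + slope * s))))
             for s in score]
    overall = c_statistic(score, event)

    # Stratify by the SAME score that is being evaluated (assumed tertiles).
    order = sorted(range(n), key=lambda i: score[i])
    third = n // 3
    strata = {
        "low": order[:third],
        "moderate": order[third:2 * third],
        "high": order[2 * third:],
    }
    within = {
        name: c_statistic([score[i] for i in idx], [event[i] for i in idx])
        for name, idx in strata.items()
    }
    return overall, within


overall_c, within_c = simulate_restriction()
print(f"overall C-statistic: {overall_c:.3f}")
for name, value in within_c.items():
    print(f"{name:>8} risk stratum: {value:.3f}")
```

Under these assumptions every within-stratum C-statistic falls below the overall value, even though the score and the outcome model are identical in both analyses; only the spread of the score has changed.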

The second paradox is that the apparent gain in C-statistic from adding CACS to the base model is greater within each individual risk stratum than for the whole study population. This is not a true “gain”: within each CVD risk stratum the “discrimination” of the base model is artificially reduced, and hence the “gain” from CACS is artefactually increased. This results in overestimation of the improved discrimination provided by CACS.
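The inflation of the incremental gain can be sketched in the same way. Again this is a hypothetical illustration rather than the authors' analysis: the new marker is assumed to be standard normal and independent of the base score, with an odds ratio of 2 per unit, and the extended score uses the true simulation coefficients instead of a fitted model. All names and parameter values are assumptions.

```python
import math
import random


def c_statistic(scores, events):
    """AUC from the rank-sum formula; assumes continuous scores with no ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if events[i])
    n_events = sum(events)
    n_non = len(events) - n_events
    return (rank_sum - n_events * (n_events + 1) / 2) / (n_events * n_non)


def simulate_gain(n=20000, b_base=1.2, b_new=math.log(2), intercept=-1.0, seed=2):
    """Gain in C-statistic from a new marker, overall and within base-score tertiles."""
    rng = random.Random(seed)
    base = [rng.gauss(0.0, 1.0) for _ in range(n)]    # existing risk score
    marker = [rng.gauss(0.0, 1.0) for _ in range(n)]  # new, independent predictor
    # Extended score: true linear predictor (assumption: no model fitting step).
    extended = [b_base * x + b_new * z for x, z in zip(base, marker)]
    event = [int(rng.random() < 1 / (1 + math.exp(-(intercept + lp))))
             for lp in extended]

    def gain(idx):
        y = [event[i] for i in idx]
        c_base = c_statistic([base[i] for i in idx], y)
        c_ext = c_statistic([extended[i] for i in idx], y)
        return c_ext - c_base

    # Strata are defined by the base score, which is also the base model.
    order = sorted(range(n), key=lambda i: base[i])
    third = n // 3
    strata = {
        "low": order[:third],
        "moderate": order[third:2 * third],
        "high": order[2 * third:],
    }
    return gain(range(n)), {name: gain(idx) for name, idx in strata.items()}


overall_gain, stratum_gain = simulate_gain()
print(f"overall gain in C-statistic: {overall_gain:.3f}")
for name, value in stratum_gain.items():
    print(f"{name:>8} risk stratum gain: {value:.3f}")
```

The marker has the same association with the outcome everywhere, yet the gain reported within each stratum exceeds the overall gain, because the base score has less spread, and therefore less discrimination to lose, inside a stratum that it defines.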

Discussion

These two paradoxes related to stratification may seem somewhat surprising but may be more readily understood with other examples. Intelligence quotient (IQ) might be predictive of a young person’s future income level, but any discrimination is weakened by assessment within 10-unit IQ strata. Similarly, blood pressure predicts future stroke, but this prediction is weakened if examined within 10 mmHg bands. This apparently weaker predictive ability is due to the artificial constriction of the predictor and the nature of the discrimination measure.

Figure 2 provides a hypothetical example to help explain these paradoxes. Figure 2A shows 42 people, 21 who have an event and 21 who do not, grouped into low, moderate, and high risk according to a risk score. The C-statistic is good for the overall cohort (0.78), but lower in the narrower risk subgroups of 14 people (low risk: 0.61, moderate risk: 0.57, high risk: 0.61), because some of the “discrimination” is already used in separating into these groups. Figure 2B adds a second prognostic factor which “improves” the C-statistic more within each of the risk subgroups (low risk: 0.02, moderate risk: 0.03, high risk: 0.03) than in the overall cohort (0.01).


Figure 2. A simple illustration of the paradoxes.

Figure 2A shows 21 people who had an event (red dots) and 21 who did not (blue dots). The C-statistic for the overall cohort (0.78) is higher than in any of the three risk subgroups (low risk: 0.61, moderate risk: 0.57, high risk: 0.61). Figure 2B: when a second indicator (crosses; odds ratio of ~2.0) is added to the model, the C-statistic “improves” more in each of the risk subgroups (low risk: 0.02, moderate risk: 0.03, high risk: 0.03) than in the overall cohort (0.01).

Given the increasing use of risk-stratified analyses of prognostic gain, we recommend that the incremental discrimination provided by a new biomarker should not be analysed within risk-stratified subgroups based on the CVD risk score. Authors, reviewers, and editors should be aware of this flawed analysis and avoid it. More generally, the limitations of discrimination measures1 mean we should consider alternative measures to assess the incremental value of new biomarkers7 and be wary of stratified analyses, particularly when the stratification and the base CVD risk score are the same.

Data availability

All data underlying the results are available as part of the article and no additional source data are required.

Author contributions

L.Z.: Methodology, Software, Writing - Original Draft. K.B.: Methodology, Supervision, Writing - Reviewing and Editing. A.S.: Supervision, Writing - Reviewing and Editing. P.G.: Methodology, Conceptualization, Supervision, Writing - Reviewing and Editing.

Competing interests

No competing interests were disclosed.

Grant information

KB is supported by NHMRC Investigator grant 1174523. PG is supported by NHMRC Australian Fellowship grant 1080042. This study was funded by NHMRC Centre of Research Excellence grant 2006545.

How to cite this article

Zhu L, Bell KJ, Scott AM and Glasziou P. Analyses within risk strata overestimate gain in discrimination: the example of coronary artery calcium scores [version 1; peer review: 2 approved with reservations]. F1000Research 2022, 11:416 (https://doi.org/10.12688/f1000research.109490.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to reviewer statuses
Approved: the paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: a number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: fundamental flaws in the paper seriously undermine the findings and conclusions.
Reviewer Report 07 Jun 2024
Abhaya Indrayan, Department of Clinical Research, Max Healthcare Institute, New Delhi, India 
Approved with Reservations

Paradox in the Values of C-index

Abhaya Indrayan
Max Healthcare Institute, New Delhi

Zhu et al.1 have highlighted an important paradox with the values of C-index when ...
HOW TO CITE THIS REPORT
Indrayan A. Reviewer Report For: Analyses within risk strata overestimate gain in discrimination: the example of coronary artery calcium scores [version 1; peer review: 2 approved with reservations]. F1000Research 2022, 11:416 (https://doi.org/10.5256/f1000research.120996.r284210)
Reviewer Report 08 Aug 2023
Jonathan D Mosley, Vanderbilt University medical center, Nashville, Tennessee, USA 
Approved with Reservations
Zhu et al. have written a brief report highlighting the consequences to c-statistic estimates when risk models are secondarily stratified by the modeled predicted risk. In this case, stratification can lead to inflated c-statistic estimates within strata. This is an ...
HOW TO CITE THIS REPORT
Mosley JD. Reviewer Report For: Analyses within risk strata overestimate gain in discrimination: the example of coronary artery calcium scores [version 1; peer review: 2 approved with reservations]. F1000Research 2022, 11:416 (https://doi.org/10.5256/f1000research.120996.r192871)