Analyses within risk strata overestimate gain in discrimination: the example of coronary artery calcium scores

Lin Zhu; Katy JL Bell; Anna Mae Scott; Paul P Glasziou

doi:10.12688/f1000research.109490.1

Brief Report

[version 1; peer review: 2 approved with reservations]

Lin Zhu¹, Katy JL Bell¹, Anna Mae Scott², Paul P Glasziou²

PUBLISHED 13 Apr 2022

Author details

¹ School of Public Health, University of Sydney, Sydney, NSW, 2008, Australia
² Institute for Evidence-Based Healthcare, Bond University, Gold Coast, Queensland, 4229, Australia

Lin Zhu
Roles: Formal Analysis, Methodology, Visualization, Writing – Original Draft Preparation

Katy JL Bell
Roles: Methodology, Supervision, Writing – Review & Editing

Anna Mae Scott
Roles: Methodology, Supervision, Writing – Review & Editing

Paul P Glasziou
Roles: Conceptualization, Methodology, Supervision, Writing – Review & Editing

Abstract

Risk prediction models are potentially useful tools for health practitioners and policy makers. When new predictors are proposed for addition to existing models, improvement in discrimination is one of the main measures used to assess any increment in performance. In assessing such predictors, we observed two paradoxes: 1) the discriminative ability within every individual risk stratum was worse than for the overall population; 2) incremental discrimination after including a new predictor was greater within each individual risk stratum than for the whole population. We show two examples of the paradoxes and analyse the possible causes. The key cause of bias is the use of the same prediction model both for stratifying the population and as the base model to which the new predictor is added.

Keywords

ROC curve, C-statistic, risk prediction models, heart disease risk factors

Corresponding author: Lin Zhu

Competing interests: No competing interests were disclosed.

Grant information: KB is supported by NHMRC Investigator grant 1174523. PG is supported by NHMRC Australian Fellowship grant 1080042. This study was funded by NHMRC Centre of Research Excellence grant 2006545.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2022 Zhu L et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Zhu L, Bell KJ, Scott AM and Glasziou P. Analyses within risk strata overestimate gain in discrimination: the example of coronary artery calcium scores [version 1; peer review: 2 approved with reservations]. F1000Research 2022, 11:416 (https://doi.org/10.12688/f1000research.109490.1) First published: 13 Apr 2022, 11:416 (https://doi.org/10.12688/f1000research.109490.1) Latest published: 13 Apr 2022, 11:416 (https://doi.org/10.12688/f1000research.109490.1)

Introduction

Several new biomarkers, including coronary artery calcium scores (CACS), have been proposed to improve cardiovascular disease (CVD) risk prediction models, such as the Framingham Risk Score (FRS) or Pooled Cohort Equations (PCE). Their incremental value is usually judged by any improvement in discrimination, using measures such as the C-statistic (the area under the receiver operating characteristic curve, AUC), despite some limitations.¹,²

Several recent assessments of the added value of CACS to CVD risk models have reported gains within specific CVD risk strata, but restricting the study population in this way may inflate the apparent gain.
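For readers less familiar with the measure, the C-statistic can be read as the proportion of all (event, non-event) pairs in which the person with the event has the higher risk score, with ties counted as one half. The following minimal Python sketch illustrates that definition only; the function name and the six scores are hypothetical and are not taken from any of the studies discussed here.

```python
def c_statistic(case_scores, noncase_scores):
    """Proportion of (case, non-case) pairs ranked correctly; ties count 0.5."""
    concordant = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                concordant += 1.0
            elif case == noncase:
                concordant += 0.5
    return concordant / (len(case_scores) * len(noncase_scores))


# Hypothetical risk scores for three people with an event and three without.
cases = [0.9, 0.6, 0.4]
noncases = [0.7, 0.3, 0.2]

c = c_statistic(cases, noncases)
print(f"{c:.3f}")  # 7 of the 9 pairs are concordant, so this prints 0.778
```

A value of 0.5 corresponds to ranking pairs no better than chance, and 1.0 to ranking every pair correctly.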

Methods

In the process of a systematic review to assess the incremental value of CACS beyond traditional CVD risk assessment, we identified two studies that reported the change in C-statistic from adding CACS within CVD risk strata as well as for the overall cohort.³ We used these two studies to illustrate the observed paradoxes, and then explored possible reasons using a simple simulation. All analyses were performed with R Project for Statistical Computing (version 3.6.3, RRID: SCR_001905).

Results

Two studies provided sufficient data: the Heinz Nixdorf Recall (HNR) and Multi-Ethnic Study of Atherosclerosis (MESA) studies.⁴,⁵ Both compared the C-statistics of base models to the C-statistics of extended models (including CACS) in sub-groups defined by CVD risk scores. The apparent increase in C-statistic from adding CACS was greater within every risk sub-group than for the overall cohort (Figure 1), so the gains in all strata were above average. There are two paradoxes, the first explaining the second.

Figure 1. Overall and stratum-specific C-statistics for base and CACS extended model in two studies.³,⁴

Data of Panel A extracted from Geisel 2017, Table 3; Data of Panel B extracted from Blaha 2021, Figure 2. Both studies compared the C-statistics of base models (FRS in HNR, PCE in MESA) to C-statistics of extended models (including CACS) in sub-groups defined by CVD risk scores. CACS: coronary artery calcium scores.

The first paradox is that the discriminative ability of the CVD risk score within individual CVD risk strata is worse than for the overall population. This surprising “finding” is a statistical artefact: the discriminative ability of a variable will always appear to be less if its range is limited (or within a more homogeneous population), than within the full (more heterogeneous) population.⁶
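This range-restriction effect can be reproduced with a small simulation. The Python sketch below is not the simulation used for this report (that was run in R and is not described in detail); it assumes a hypothetical standard-normal risk score, a logistic outcome model with arbitrarily chosen intercept and slope, and three strata formed by tertiles of the score.

```python
import math
import random


def c_statistic(scores, events):
    """Rank-based C-statistic (Mann-Whitney form), using mid-ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank of the tie group
        i = j + 1
    n_events = sum(events)
    n_non = len(events) - n_events
    rank_sum = sum(r for r, e in zip(ranks, events) if e)
    return (rank_sum - n_events * (n_events + 1) / 2) / (n_events * n_non)


# Assumed data-generating model (illustrative values only).
N, INTERCEPT, SLOPE = 20000, -1.0, 1.5
rng = random.Random(1)
score = [rng.gauss(0.0, 1.0) for _ in range(N)]
event = [int(rng.random() < 1 / (1 + math.exp(-(INTERCEPT + SLOPE * s))))
         for s in score]

# Low / moderate / high risk strata defined by tertiles of the same score.
ordered = sorted(score)
low_cut, high_cut = ordered[N // 3], ordered[2 * N // 3]
stratum = [0 if s < low_cut else 1 if s < high_cut else 2 for s in score]

overall_c = c_statistic(score, event)
stratum_c = []
for g in range(3):
    idx = [i for i in range(N) if stratum[i] == g]
    stratum_c.append(c_statistic([score[i] for i in idx],
                                 [event[i] for i in idx]))

print(f"overall C: {overall_c:.3f}")
print("within-stratum C:", [round(c, 3) for c in stratum_c])
```

Under these assumptions the score is equally informative for everyone, yet each stratum-specific C-statistic is lower than the overall value simply because the score varies less within a stratum.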

The second paradox is that the apparent gain in C-statistic for CACS added to the base model is greater within each individual risk stratum than for the whole study population. This is not a true “gain”: within each CVD risk stratum the “discrimination” of the base model is artificially reduced, and hence the “gain” from CACS is artefactually increased. This results in overestimation of the improved discrimination provided by CACS.
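The inflated gain can also be illustrated by simulation. The Python sketch below is a hypothetical setup, not the analysis behind this report: it assumes a standard-normal base score, an independent standard-normal new marker with an odds ratio of 2 per standard deviation, and it uses the true data-generating coefficients for both the base and the extended model rather than fitting regressions. Strata are tertiles of the base score, so the same score is used for stratification and as the base model.

```python
import math
import random


def c_statistic(scores, events):
    """Rank-based C-statistic (Mann-Whitney form), using mid-ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank of the tie group
        i = j + 1
    n_events = sum(events)
    n_non = len(events) - n_events
    rank_sum = sum(r for r, e in zip(ranks, events) if e)
    return (rank_sum - n_events * (n_events + 1) / 2) / (n_events * n_non)


# Assumed coefficients (illustrative values only).
N, INTERCEPT, B_BASE, B_NEW = 20000, -1.0, 1.5, math.log(2.0)
rng = random.Random(2)
base_lp = [B_BASE * rng.gauss(0.0, 1.0) for _ in range(N)]   # base model
ext_lp = [b + B_NEW * rng.gauss(0.0, 1.0) for b in base_lp]  # base + new marker
event = [int(rng.random() < 1 / (1 + math.exp(-(INTERCEPT + lp))))
         for lp in ext_lp]


def gain(idx):
    """Change in C-statistic (extended minus base) among the people in idx."""
    y = [event[i] for i in idx]
    return (c_statistic([ext_lp[i] for i in idx], y)
            - c_statistic([base_lp[i] for i in idx], y))


# Strata are tertiles of the base model's own score.
ordered = sorted(base_lp)
low_cut, high_cut = ordered[N // 3], ordered[2 * N // 3]
strata = {
    "low": [i for i in range(N) if base_lp[i] < low_cut],
    "moderate": [i for i in range(N) if low_cut <= base_lp[i] < high_cut],
    "high": [i for i in range(N) if base_lp[i] >= high_cut],
}

overall_gain = gain(range(N))
stratum_gain = {name: gain(idx) for name, idx in strata.items()}
print(f"overall gain: {overall_gain:.3f}")
print({name: round(g, 3) for name, g in stratum_gain.items()})
```

In this setup the new marker has the same effect in every person, so the larger stratum-specific gains reflect only the artificially lowered starting point of the base model within each stratum.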

Discussion

These two paradoxes related to stratification may seem somewhat surprising but may be more readily understood with other examples. Intelligence quotient (IQ) might be predictive of a young person’s future income level, but any discrimination is weakened by assessment within 10-unit IQ strata. Similarly, blood pressure predicts future stroke, but this prediction is weakened if examined within 10 mmHg bands. This apparently weaker predictive ability is due to the artificial constriction of the predictor and the nature of the discrimination measure.

Figure 2 provides a hypothetical example to help explain these paradoxes. Figure 2A shows 42 people, 21 who have an event and 21 who do not, grouped into low, moderate, and high risk according to a risk score. The C-statistic is good for the overall cohort (0.78), but lower in the narrower risk subgroups of 14 people (low risk: 0.61, moderate risk: 0.57, high risk: 0.61), because some of the “discrimination” is already used in separating into these groups. Figure 2B adds a second prognostic factor which “improves” the C-statistic more within each of the risk subgroups (low risk: 0.02, moderate risk: 0.03, high risk: 0.03) than in the overall cohort (0.01).

Figure 2. A simple illustration of the paradoxes.

Figure 2A shows 21 people who had an event (red dots) and 21 who did not (blue dots). The C-statistic for the overall cohort (0.78) is higher than in any of the three risk subgroups (low risk: 0.61, moderate risk: 0.57, high risk: 0.61). Figure 2B: when a second indicator (crosses; odds ratio of ~2.0) is added to the model, the C-statistic “improves” more in each of the risk subgroups (low risk: 0.02, moderate risk: 0.03, high risk: 0.03) than in the overall cohort (0.01).
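One way to see where the “used-up” discrimination goes is to split the event/non-event pairs into those within a stratum and those between strata. When strata are formed by cutting the risk score itself, every between-stratum pair is ranked correctly if the event falls in the higher stratum and incorrectly otherwise, so the overall C-statistic combines the within-stratum values with these between-stratum pairs. The Python sketch below shows this arithmetic. The within-stratum C-statistics are those quoted for Figure 2A, but the number of events per stratum is not given in the text, so the split used here (3, 7 and 11 events among 14 people per stratum) is a hypothetical assumption chosen only for illustration.

```python
def overall_c_from_strata(n_cases, n_noncases, within_c):
    """Overall C-statistic from per-stratum counts and within-stratum C values.

    Strata are listed from lowest to highest risk and are assumed to be
    non-overlapping cuts of the same score, so a pair with the case in a
    higher stratum than the non-case is always concordant.
    """
    total_pairs = sum(n_cases) * sum(n_noncases)
    concordant = 0.0
    for j, cases_j in enumerate(n_cases):
        # Pairs within stratum j: concordant in proportion within_c[j].
        concordant += cases_j * n_noncases[j] * within_c[j]
        # Pairs with the non-case in a lower stratum: all concordant.
        concordant += cases_j * sum(n_noncases[:j])
    return concordant / total_pairs


# Hypothetical split of 21 events and 21 non-events over three strata of 14.
n_cases = [3, 7, 11]
n_noncases = [11, 7, 3]
within_c = [0.61, 0.57, 0.61]  # values quoted for Figure 2A

overall = overall_c_from_strata(n_cases, n_noncases, within_c)
within_share = sum(c * n for c, n in zip(n_cases, n_noncases)) / (21 * 21)
print(f"overall C: {overall:.2f}; share of pairs within strata: {within_share:.2f}")
```

With this assumed split, only about a quarter of the 441 pairs lie within a stratum, and the calculation gives an overall C-statistic of about 0.78 despite within-stratum values near 0.6. Most of the score's discrimination is expressed in the between-stratum pairs, which a stratified analysis discards.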

Given the increasing use of risk stratified analyses of prognostic gain, we recommend the incremental discrimination provided by a new biomarker should not be analysed within risk stratified subgroups based on the CVD risk score. Authors, reviewers, and editors should be aware of this flawed analysis and avoid it. More generally, the limitations of discrimination measures¹ mean we should consider alternative measures to assess the incremental value of new biomarkers⁷ and be wary of stratified analyses, particularly when the stratification and the base CVD risk score are the same.

Data availability

All data underlying the results are available as part of the article and no additional source data are required.

Author contributions

L.Z.: Methodology, Software, Writing - Original Draft. K.B.: Methodology, Supervision, Writing - Reviewing and Editing. A.S.: Supervision, Writing - Reviewing and Editing. P.G.: Methodology, Conceptualization, Supervision, Writing - Reviewing and Editing.

Acknowledgments

NHMRC had no role in study design, data analysis, decision to publish, or preparation of the manuscript.

References

1. Cook NR: Use and Misuse of the Receiver Operating Characteristic Curve in Risk Prediction. Circulation. 2007; 115: 928–935.
2. Ware JH: The Limitations of Risk Factors as Prognostic Tools. N. Engl. J. Med. 2006; 355: 2615–2617.
3. Bell KJL, White S, Hassan O, et al.: Evaluation of the Incremental Value of a Coronary Artery Calcium Score Beyond Traditional Cardiovascular Risk Assessment: A Systematic Review and Meta-analysis. JAMA Intern. Med. 2022. (In process).
4. Geisel MH, Bauer M, Hennig F, et al.: Comparison of coronary artery calcification, carotid intima-media thickness and ankle-brachial index for predicting 10-year incident cardiovascular events in the general population. Eur. Heart J. 2017; 38: 1815–1822.
5. Blaha MJ, Whelton SP, Al Rifai M, et al.: Comparing Risk Scores in the Prediction of Coronary and Cardiovascular Deaths. JACC Cardiovasc. Imaging. 2021; 14: 411–421.
6. Debray TPA, Damen JAAG, Riley RD, et al.: A framework for meta-analysis of prediction model studies with binary and time-to-event outcomes. Stat. Methods Med. Res. 2019; 28: 2768–2786.
7. Pencina MJ, D’Agostino RB, D’Agostino RB, et al.: Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Stat. Med. 2008; 27: 157–172.

Open Peer Review

Reviewer Report 07 Jun 2024

Abhaya Indrayan, Department of Clinical Research, Max Healthcare Institute, New Delhi, India

Approved with Reservations

https://doi.org/10.5256/f1000research.120996.r284210

Paradox in the Values of C-index

Abhaya Indrayan
Max Healthcare Institute, New Delhi

Zhu et al.¹ have highlighted an important paradox with the values of the C-index, in which the aggregate picture is not the sum of its parts. They demonstrated this with the help of two real-life examples. In both, the gain in discrimination measured by the change in C-statistic is smaller in aggregate but larger in each of the risk-score strata when the CAC Extended Model is compared with the Base Model in the Heinz Nixdorf Recall study² and the Multi-Ethnic Study of Atherosclerosis³. The second anomaly they pointed out is that the C-index is higher for the aggregate than for each risk-score stratum.

It is good to see that such anomalies have been detected and highlighted. They add to the one discussed earlier by Cook⁴: the inadequacy of the C-index as a measure of discrimination, in that it compromises the contribution of individual biomarkers when used for assessing the performance of a model. A further one, recently discussed by Indrayan et al.⁵, is the inadequacy of the C-index for assessing predictivity, as this index is based on sensitivity and specificity, which are measures of discrimination and classification of known outcomes and not of the prediction of unknown outcomes. In the present paper too, Zhu et al. have used the term ‘predictive ability’, which is inappropriate for the C-index.

Whereas the authors have successfully demonstrated the paradox, the explanation provided by them requires more elaboration. They attribute this paradox to the homogeneity of values within each stratum, which may have ‘artificially’ reduced the discrimination ability. A more detailed explanation would have helped readers to appreciate this paradox. For example, it is not mentioned whether this would always happen and whether the anomaly increases as the stratum size decreases. Secondly, it would have been helpful if properly framed advice had been included for researchers on explaining or getting over this paradox. They advise that assessment of incremental discrimination by an added marker should avoid analyses within subgroups. This advice may not be applicable when subgroup analysis is required in a clinical context. My suggestion is that subgroups may be analysed where needed but the anomaly be fully explained. It would have been more useful to readers if the alternative measures⁶ they suggest had been fully explained, including how they can be used to study or explain the paradox.

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Yes
Are sufficient details of methods and analysis provided to allow replication by others? No
If applicable, is the statistical analysis and its interpretation appropriate? No
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? No

References

1. Zhu L, Bell K, Scott A, Glasziou P: Analyses within risk strata overestimate gain in discrimination: the example of coronary artery calcium scores. F1000Research. 2022; 11.
2. Geisel MH, Bauer M, Hennig F, Hoffmann B, et al.: Comparison of coronary artery calcification, carotid intima-media thickness and ankle-brachial index for predicting 10-year incident cardiovascular events in the general population. Eur Heart J. 2017; 38 (23): 1815-1822.
3. Blaha MJ, Whelton SP, Al Rifai M, Dardari Z, et al.: Comparing Risk Scores in the Prediction of Coronary and Cardiovascular Deaths: Coronary Artery Calcium Consortium. JACC Cardiovasc Imaging. 2021; 14 (2): 411-421.
4. Cook NR: Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation. 2007; 115 (7): 928-35.
5. Indrayan A, Malhotra RK, Pawar M: Use of ROC curve analysis for prediction gives fallacious results: Use predictivity-based indices. J Postgrad Med. 2024; 70 (2): 91-96.
6. Pencina MJ, D'Agostino RB, D'Agostino RB, Vasan RS: Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008; 27 (2): 157-72; discussion 207.

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Medical Biostatistics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report 08 Aug 2023

Jonathan D Mosley, Vanderbilt University medical center, Nashville, Tennessee, USA

Approved with Reservations

https://doi.org/10.5256/f1000research.120996.r192871

Zhu et al. have written a brief report highlighting the consequences to c-statistic estimates when risk models are secondarily stratified by the modeled predicted risk. In this case, stratification can lead to inflated c-statistic estimates within strata. This is an important artifact to highlight, as it is common to see post-hoc stratification in risk prediction studies to identify subgroups where a biomarker is purported to have a greater benefit. While the overall message is clear and well-written, there are some details missing in the methods, particularly with respect to the simulation. It would also be useful to highlight other scenarios where the issue would manifest.

My specific comments are:

Methods: The authors should clarify whether the data from Figure 1 are re-calculated estimates or whether these were the original data presented in the referenced papers. How was the c-statistic calculated (either in the original papers or in this study)?
Methods: More details are needed regarding the simulation. It is not clear what was done. Does Figure 2 show the results of the simulation study, or is this just an illustrative figure?
Discussion: Will the c-statistics always be inflated with stratification? Are there scenarios where it could go down?
Discussion: It is common to see post-hoc stratification performed using one variable that is a component of a prediction model. For instance, this is often seen with stratification by age (which is a component of most prediction models) to show that a biomarker has greater effect in younger individuals. Does this scenario lead to the same inflated c-statistic estimates?

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Yes
Are sufficient details of methods and analysis provided to allow replication by others? No
If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Genetic epidemiology.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
