<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/publishing/1.2/JATS-journalpublishing1.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.2" xml:lang="en">
    <front>
        <journal-meta>
            <journal-id journal-id-type="pmc">F1000Research</journal-id>
            <journal-title-group>
                <journal-title>F1000Research</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2046-1402</issn>
            <publisher>
                <publisher-name>F1000 Research Limited</publisher-name>
                <publisher-loc>London, UK</publisher-loc>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id pub-id-type="doi">10.12688/f1000research.179913.2</article-id>
            <article-categories>
                <subj-group subj-group-type="heading">
                    <subject>Research Article</subject>
                </subj-group>
                <subj-group>
                    <subject>Articles</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Web-based Machine Learning Model for Predicting Chronic Kidney Disease in Patients with Type 2 Diabetes Mellitus: A Multicenter Study</article-title>
                <fn-group content-type="pub-status">
                    <fn>
                        <p>[version 2; peer review: 1 approved, 1 approved with reservations]</p>
                    </fn>
                </fn-group>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Kresnowati</surname>
                        <given-names>Lily</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="corresp" rid="c1">a</xref>
                    <xref ref-type="aff" rid="a1">1</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Suhartono</surname>
                        <given-names>Suhartono</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a2">2</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Shaluhiyah</surname>
                        <given-names>Zahroh</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0003-2663-7918</uri>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="no">
                    <name>
                        <surname>Widjanarko</surname>
                        <given-names>Bagoes</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <xref ref-type="aff" rid="a3">3</xref>
                </contrib>
                <contrib contrib-type="author" corresp="yes">
                    <name>
                        <surname>Hasan</surname>
                        <given-names>Faizul</given-names>
                    </name>
                    <role content-type="http://credit.niso.org/">Conceptualization</role>
                    <role content-type="http://credit.niso.org/">Data Curation</role>
                    <role content-type="http://credit.niso.org/">Formal Analysis</role>
                    <role content-type="http://credit.niso.org/">Funding Acquisition</role>
                    <role content-type="http://credit.niso.org/">Investigation</role>
                    <role content-type="http://credit.niso.org/">Methodology</role>
                    <role content-type="http://credit.niso.org/">Project Administration</role>
                    <role content-type="http://credit.niso.org/">Resources</role>
                    <role content-type="http://credit.niso.org/">Software</role>
                    <role content-type="http://credit.niso.org/">Supervision</role>
                    <role content-type="http://credit.niso.org/">Validation</role>
                    <role content-type="http://credit.niso.org/">Visualization</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Original Draft Preparation</role>
                    <role content-type="http://credit.niso.org/">Writing &#x2013; Review &amp; Editing</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-7802-1328</uri>
                    <xref ref-type="corresp" rid="c2">b</xref>
                    <xref ref-type="aff" rid="a4">4</xref>
                </contrib>
                <aff id="a1">
                    <label>1</label>Doctoral Program of Public Health, Faculty of Public Health, Universitas Diponegoro, Semarang, Central Java, Indonesia</aff>
                <aff id="a2">
                    <label>2</label>Department of Environmental Health, Diponegoro University School of Public Health, Semarang, Central Java, Indonesia</aff>
                <aff id="a3">
                    <label>3</label>Department of Health Promotion and Behavioral Science, Faculty of Public Health, Universitas Diponegoro, Semarang, Central Java, Indonesia</aff>
                <aff id="a4">
                    <label>4</label>Faculty of Nursing, Chulalongkorn University, Bangkok, Bangkok, Thailand</aff>
            </contrib-group>
            <author-notes>
                <corresp id="c1">
                    <label>a</label>
                    <email xlink:href="mailto:lilykresnowati@undip.ac.id">lilykresnowati@undip.ac.id</email>
                </corresp>
                <corresp id="c2">
                    <label>b</label>
                    <email xlink:href="mailto:faizul.h@chula.ac.th">faizul.h@chula.ac.th</email>
                </corresp>
                <fn fn-type="conflict">
                    <p>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>10</day>
                <month>6</month>
                <year>2026</year>
            </pub-date>
            <pub-date pub-type="collection">
                <year>2026</year>
            </pub-date>
            <volume>15</volume>
            <elocation-id>690</elocation-id>
            <history>
                <date date-type="accepted">
                    <day>3</day>
                    <month>6</month>
                    <year>2026</year>
                </date>
            </history>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2026 Kresnowati L et al.</copyright-statement>
                <copyright-year>2026</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <self-uri content-type="pdf" xlink:href="https://f1000research.com/articles/15-690/pdf"/>
            <abstract>
                <sec>
                    <title>Background</title>
                    <p>Chronic kidney disease (CKD) is a serious complication of type 2 diabetes mellitus (T2DM), particularly in low- and middle-income countries with limited access to early diagnosis. Predicting CKD risk using routine clinical data could enable earlier nephroprotective care. This study developed and internally validated a machine learning-based web application to predict incident CKD among T2DM patients enrolled in Prolanis, the chronic disease management program of Indonesia&#x2019;s national health insurance.</p>
                </sec>
                <sec>
                    <title>Methods</title>
                    <p>A machine learning prediction model was developed using BPJS Prolanis data (2017&#x2013;2023). Adults (&#x2265;18&#x00a0;years) with T2DM and no prior CKD were included. Six algorithms (Logistic Regression, Random Forest, Decision Tree, XGBoost, LightGBM, CatBoost) were trained on 80% of the data to predict CKD and internally validated on the remaining 20%. Performance was assessed via accuracy, precision, recall, F1 score, and AUC. SHAP was used for interpretability.</p>
                </sec>
                <sec>
                    <title>Results</title>
                    <p>Among 7,581 individuals, 864 (11.4%) developed CKD. CatBoost achieved the best performance (AUC&#x00a0;=&#x00a0;0.847, accuracy&#x00a0;=&#x00a0;0.797, precision&#x00a0;=&#x00a0;0.643, recall&#x00a0;=&#x00a0;0.525, F1&#x00a0;=&#x00a0;0.578). SHAP identified rapid-acting insulin analogues, amlodipine, furosemide, high blood urea nitrogen, and folic acid as key positive predictors. Advanced age and higher comorbidity burden increased risk, while chronic ischaemic heart disease and dental pulp diseases appeared protective&#x2014;likely due to healthcare utilization bias. A web-based risk calculator was developed.</p>
                </sec>
                <sec>
                    <title>Conclusions</title>
                    <p>The CatBoost-based web app demonstrated strong discriminative ability for predicting incident CKD in T2DM patients using routine claims data. This tool may support risk stratification in primary care settings across Indonesia and similar low-resource environments.</p>
                </sec>
            </abstract>
            <kwd-group kwd-group-type="author">
                <kwd>chronic kidney disease</kwd>
                <kwd>type 2 diabetes mellitus</kwd>
                <kwd>machine learning</kwd>
                <kwd>prediction model</kwd>
                <kwd>web-based calculator</kwd>
            </kwd-group>
            <funding-group>
                <funding-statement>The author(s) declared that no grants were involved in supporting this work.</funding-statement>
            </funding-group>
        </article-meta>
        <notes>
            <sec sec-type="version-changes">
                <label>Revised</label>
                <title>Amendments from Version 1</title>
                <p>This updated version of the manuscript incorporates essential structural and reporting enhancements to fully comply with TRIPOD statement criteria and peer-review recommendations. Table 1 has been revised to stratify baseline clinical data alongside corresponding p-values, a new participant flow diagram has been incorporated as Figure 1, and all performance indicators have been standardised to three decimal places. The content has been thoroughly checked for formatting consistency, and the conclusion has been updated to explicitly present the web-based calculator as a preliminary screening tool poised for potential incorporation into Indonesia's national BPJS P-Care platform.</p>
            </sec>
        </notes>
    </front>
    <body>
        <sec id="sec5" sec-type="intro">
            <title>Introduction</title>
            <p>Chronic kidney disease (CKD) constitutes a significant global public health challenge, with particularly concerning trends observed in low- and middle-income nations.
                <sup>
                    <xref ref-type="bibr" rid="ref1">1</xref>,
                    <xref ref-type="bibr" rid="ref2">2</xref>
                </sup> Characterised by a gradual decline in renal function, CKD significantly elevates the risks of cardiovascular incidents, end-stage renal disease (ESRD), hospitalisation, and early mortality.
                <sup>
                    <xref ref-type="bibr" rid="ref3">3</xref>
                </sup> Approximately 10% of people worldwide have CKD, with diabetes mellitus and hypertension responsible for more than half of the cases.
                <sup>
                    <xref ref-type="bibr" rid="ref4">4</xref>
                </sup> The prevalence of CKD in Indonesia has consistently increased over the last decade, primarily due to the rising incidence of type 2 diabetes mellitus (T2DM), which currently impacts around 10.7 million adults nationwide.
                <sup>
                    <xref ref-type="bibr" rid="ref5">5</xref>,
                    <xref ref-type="bibr" rid="ref6">6</xref>
                </sup> The economic and societal burdens are significant:
                <sup>
                    <xref ref-type="bibr" rid="ref7">7</xref>,
                    <xref ref-type="bibr" rid="ref8">8</xref>
                </sup> chronic kidney disease necessitates continuous care, regular monitoring, and, in advanced stages, expensive renal replacement therapy.</p>
            <p>Individuals with T2DM are at an elevated risk of developing CKD, referred to in this setting as diabetic kidney disease (DKD). Between 20% and 40% of patients with T2DM will develop DKD during their lifetime, making diabetes the leading cause of ESRD in many countries.
                <sup>
                    <xref ref-type="bibr" rid="ref9">9</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref12">12</xref>
                </sup> The aetiology of DKD involves complex interactions among hyperglycaemia, haemodynamic changes, inflammation, and genetic predisposition.
                <sup>
                    <xref ref-type="bibr" rid="ref13">13</xref>,
                    <xref ref-type="bibr" rid="ref14">14</xref>
                </sup> Hyperglycaemia triggers glomerular hyperfiltration, oxidative stress, and the buildup of advanced glycation end-products, ultimately leading to glomerulosclerosis, tubulointerstitial fibrosis, and progressive nephron loss.
                <sup>
                    <xref ref-type="bibr" rid="ref14">14</xref>,
                    <xref ref-type="bibr" rid="ref15">15</xref>
                </sup> Well-defined risk factors for DKD encompass advanced age, male gender, prolonged diabetes duration, higher haemoglobin A1c levels, hypertension, dyslipidaemia, obesity, and tobacco use.
                <sup>
                    <xref ref-type="bibr" rid="ref9">9</xref>,
                    <xref ref-type="bibr" rid="ref16">16</xref>
                </sup> Timely identification of T2DM patients at increased CKD risk is essential for initiating effective nephroprotective strategies: stringent glycaemic management, blood pressure control through RAAS inhibition, and lifestyle modifications.</p>
            <p>Conventional DKD risk prediction techniques predominantly employ logistic regression or Cox proportional hazards models with a restricted set of predetermined predictors. Despite achieving moderate performance (AUC values generally between 0.70 and 0.80), these models have significant limitations: they presuppose linear relationships between predictors and outcomes, and so fail to represent the complex biology of DKD adequately; they frequently require complete data on all predictors, which are often unavailable in resource-limited settings; and many are developed from specific clinical trial populations or small, single-center cohorts, which raises concerns about their generalisability.
                <sup>
                    <xref ref-type="bibr" rid="ref17">17</xref>,
                    <xref ref-type="bibr" rid="ref18">18</xref>
                </sup>
            </p>
            <p>Machine learning (ML) provides a robust alternative for predicting clinical risks. Ensemble tree-based methodologies&#x2014;specifically CatBoost, XGBoost, and LightGBM&#x2014;exhibit considerable promise owing to their resilience against outliers, capacity to manage heterogeneous data types, and intrinsic mechanisms for addressing missing values, frequently surpassing logistic regression with AUCs ranging from 0.80 to 0.90.
                <sup>
                    <xref ref-type="bibr" rid="ref19">19</xref>,
                    <xref ref-type="bibr" rid="ref20">20</xref>
                </sup> Nonetheless, significant deficiencies persist: the majority of ML research originates from high-income nations, with scant contributions from Southeast Asia, particularly Indonesia; numerous models depend on laboratory metrics that are not easily accessible in primary care settings; practical clinical application featuring user-friendly web interfaces is restricted; and model interpretability has only recently been tackled via SHAP analysis.
                <sup>
                    <xref ref-type="bibr" rid="ref21">21</xref>
                </sup>
            </p>
            <p>Indonesia&#x2019;s national health insurance system (BPJS Kesehatan) covers roughly 80% of the population and administers Prolanis, a chronic disease management program that systematically enrols T2DM patients and produces extensive real-world clinical data. Nonetheless, no machine learning-based tool has been specifically designed for the Indonesian T2DM population to predict incident CKD from routinely collected Prolanis data. This study sought to develop and internally validate a machine learning model for predicting incident CKD in patients with T2DM enrolled in Prolanis, using routinely collected demographic, clinical, and medication data. The study also aimed to improve interpretability through SHAP analysis and to offer a practical web-based calculator similar to previously published AI risk assessment tools.
                <sup>
                    <xref ref-type="bibr" rid="ref22">22</xref>
                </sup>
            </p>
        </sec>
        <sec id="sec6" sec-type="methods">
            <title>Materials and methods</title>
            <sec id="sec7">
                <title>Study design and data source</title>
                <p>We conducted a predictive ML study using a sample dataset from the Chronic Disease Management Program (Prolanis) of Indonesia&#x2019;s Badan Penyelenggara Jaminan Sosial (BPJS). The dataset included anonymised patient data gathered from January 1, 2017, to December 31, 2023. The BPJS Prolanis database comprises structured data on patient demographics, clinical diagnoses (classified according to the International Classification of Diseases, 10th Revision [ICD-10]), medication prescriptions (classified using the Anatomical Therapeutic Chemical [ATC] system), laboratory results, and outpatient visit records. The study was reported in line with the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement.</p>
            </sec>
            <sec id="sec8">
                <title>Study population</title>
                <p>The study cohort consisted of patients diagnosed with T2DM who participated in the Prolanis program. Adults (&#x2265;18&#x00a0;years) were included if they had a minimum of two documented outpatient visits during the study period. The index date was defined as the date of the first T2DM diagnosis documented within the observation period.</p>
                <p>Patients were excluded if they had a previous diagnosis of CKD at the start of their follow-up period. CKD was identified with ICD-10 codes N18.1&#x2013;N18.9 (chronic kidney disease stages 1&#x2013;5 and unspecified), alongside Z49 (dialysis care) and Z99.2 (dependence on renal dialysis). Furthermore, patients with missing outcome data or incomplete data on essential features were excluded from the study.</p>
            </sec>
            <sec id="sec9">
                <title>Outcome definition</title>
                <p>The primary outcome of this study was incident CKD occurring after the index date. CKD was defined by a combination of diagnostic and laboratory criteria to support thorough case identification. A patient was classified as having developed CKD if they met at least one of the following criteria: (1) a new ICD-10 diagnosis code for CKD (N18.1&#x2013;N18.9, Z49 for dialysis care, or Z99.2 for dependence on renal dialysis) recorded during the follow-up period, or (2) laboratory evidence of renal dysfunction noted in structured clinical documentation, defined as an estimated glomerular filtration rate (eGFR) below 60&#x00a0;mL/min/1.73&#x00a0;m
                    <sup>2</sup> or the presence of albuminuria. Patients were followed from the index date until the earliest of the following events: diagnosis of CKD, death, the last recorded clinical visit, or the end of available data (December 31, 2023).</p>
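                <p>The outcome rule and the follow-up window described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study&#x2019;s code: the function names, the argument layout, and the treatment of albuminuria as a ready-made yes/no flag are assumptions, since the text does not give an albuminuria threshold or the database schema.</p>
                <code language="python"><![CDATA[
```python
from datetime import date
from typing import Optional, Sequence

# Code set taken from the text: N18.1-N18.9, Z49 (dialysis care), Z99.2.
CKD_PREFIXES = ("Z49", "Z99.2")
STUDY_END = date(2023, 12, 31)  # end of available data, as stated in the text


def is_ckd_code(code: str) -> bool:
    """Return True if an ICD-10 code belongs to the CKD code set above."""
    code = code.strip().upper()
    if code.startswith(CKD_PREFIXES):
        return True
    # N18.1 ... N18.9 (N18.0 is not part of the stated range)
    return len(code) == 5 and code.startswith("N18.") and code[4] in "123456789"


def meets_ckd_outcome(new_codes: Sequence[str],
                      egfr: Optional[float] = None,
                      albuminuria: bool = False) -> bool:
    """CKD if a qualifying code, an eGFR below 60, or albuminuria is recorded.

    `albuminuria` is a hypothetical pre-computed flag (assumption).
    """
    if any(is_ckd_code(c) for c in new_codes):
        return True
    if egfr is not None and egfr < 60:
        return True
    return albuminuria


def follow_up_end(ckd_date: Optional[date],
                  death_date: Optional[date],
                  last_visit: date) -> date:
    """Earliest of CKD diagnosis, death, last recorded visit, and end of data."""
    candidates = [d for d in (ckd_date, death_date, last_visit, STUDY_END)
                  if d is not None]
    return min(candidates)
```
]]></code>
                <p>For example, under these assumptions <monospace>meets_ckd_outcome(["E11.9"], egfr=55.0)</monospace> is true because of the eGFR criterion alone, even though no CKD diagnosis code is present.</p>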
            </sec>
            <sec id="sec10">
                <title>Data collection and input features</title>
                <p>
Patient-level data were retrieved from the BPJS Prolanis database, including demographic details, clinical comorbidities, drug prescriptions, and available laboratory findings. Demographic characteristics comprised age (in years, as a continuous variable), sex (male or female), and overweight or obesity status, defined as a body mass index of 25&#x00a0;kg/m
                    <sup>2</sup> or higher when such data were available. Comorbidities were identified from ICD-10 codes recorded on the index date or within the preceding year; they comprised hypertension, cardiovascular disease, heart failure, and the total number of unique diagnoses as an indicator of overall illness burden. Medication use was assessed using ATC codes, with a patient deemed exposed if they had at least one prescription fill for a specific medication within the six months before or after the index date. Medications of interest included antidiabetic therapies (notably rapid-acting insulin analogues), aspirin, proton pump inhibitors (PPIs), non-steroidal anti-inflammatory drugs (NSAIDs), amlodipine, furosemide, folic acid, and other antihypertensive or antiplatelet agents. Laboratory data, particularly blood urea nitrogen (BUN) levels, were obtained where available. All features except age, which was kept as a continuous variable, were binarised to indicate presence or absence. Only cases with complete data on essential features and the outcome were included in the final analysis.</p>
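                <p>The medication exposure rule above (at least one prescription fill within six months before or after the index date) can be expressed as a small binarisation helper. The following is a minimal sketch under stated assumptions: the function name, the representation of a fill as an (ATC code, date) pair, and the approximation of six months as 183 days are not taken from the study.</p>
                <code language="python"><![CDATA[
```python
from datetime import date, timedelta
from typing import Iterable, Tuple

# Assumption: "six months" is approximated as 183 days; the text gives no day count.
WINDOW = timedelta(days=183)


def exposed(fills: Iterable[Tuple[str, date]],
            atc_prefix: str,
            index_date: date) -> int:
    """Return 1 if any fill matching the ATC prefix lies inside the window, else 0."""
    for atc_code, fill_date in fills:
        in_window = abs(fill_date - index_date) <= WINDOW
        if atc_code.startswith(atc_prefix) and in_window:
            return 1
    return 0


# Hypothetical prescription history for one patient (illustrative values only).
fills = [
    ("C08CA01", date(2019, 3, 1)),   # amlodipine, about six weeks after the index date
    ("C03CA01", date(2021, 1, 1)),   # furosemide, roughly two years after the index date
]
index_date = date(2019, 1, 15)

amlodipine_flag = exposed(fills, "C08CA01", index_date)
furosemide_flag = exposed(fills, "C03CA01", index_date)
```
]]></code>
                <p>In this toy history the amlodipine flag is 1 and the furosemide flag is 0, because only the first fill falls inside the window. Note that a window extending after the index date uses information that is not yet available at the index date.</p>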
            </sec>
            <sec id="sec11">
                <title>Model development and internal validation</title>
                <p>We developed and internally validated six ML algorithms to predict incident CKD: Logistic Regression, Random Forest, Decision Tree, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). All models were implemented using the PyCaret open-source machine learning package in Python. The complete dataset was randomly divided into a training set comprising 80% of the patients and an internal validation (test) set comprising the remaining 20%, with stratification by outcome status to maintain the proportion of CKD cases in both subsets. Hyperparameters for each algorithm were tuned using 10-fold cross-validation on the training set to improve model performance and limit overfitting. After hyperparameter tuning, each model was trained on the training set and then evaluated on the hold-out test set to estimate its predictive performance on unseen data. The CatBoost model, which showed the best overall performance, was chosen as the final model for subsequent interpretation and feature importance analysis.</p>
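                <p>The stratified 80/20 split described above can be illustrated without any ML library. The sketch below is not the PyCaret routine used in the study; the function name, the seed, and the per-class rounding rule are assumptions chosen only to show how stratification preserves the event proportion.</p>
                <code language="python"><![CDATA[
```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple


def stratified_split(labels: Sequence[int],
                     test_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) with a similar class ratio in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                            # random order within each class
        n_test = round(len(indices) * test_fraction)    # per-class test quota
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Label vector with the cohort totals reported in the Results:
# 864 CKD events among 7,581 patients (order is arbitrary here).
labels = [1] * 864 + [0] * 6717
train_idx, test_idx = stratified_split(labels)
test_event_rate = sum(labels[i] for i in test_idx) / len(test_idx)
```
]]></code>
                <p>With these totals the sketch places 173 events in a 1,516-patient test set, so the event rate stays close to 11.4% in both parts. The split sizes actually obtained in the study are not stated in the text and may differ slightly.</p>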
            </sec>
            <sec id="sec12">
                <title>Performance metrics</title>
                <p>Model performance was evaluated using a set of conventional classification metrics to allow comparison among the six algorithms. Accuracy was defined as the ratio of correct predictions (true positives plus true negatives) to total predictions. Precision, defined as the ratio of true positives to the sum of true positives and false positives, was used to quantify the burden of false-positive predictions. Recall, or sensitivity, was computed as the ratio of true positives to the sum of true positives and false negatives, indicating the model&#x2019;s ability to identify actual CKD cases. The F1 score, the harmonic mean of precision and recall, was calculated as 2 &#x00d7; (precision &#x00d7; recall) / (precision + recall), offering a balanced assessment that accounts for both false positives and false negatives. The area under the receiver operating characteristic curve (ROC-AUC) was used to evaluate the model&#x2019;s capacity to discriminate between patients who developed CKD and those who did not across classification thresholds; an AUC of 0.5 indicates random performance, whereas an AUC of 1.0 indicates perfect discrimination.</p>
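                <p>The metric definitions above translate directly into code. The following sketch uses plain Python rather than the libraries used in the study; the function names are illustrative, and the AUC is computed through its pairwise-ranking interpretation (the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half).</p>
                <code language="python"><![CDATA[
```python
from typing import Dict, Sequence


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Consistency check with the precision and recall reported for CatBoost in the abstract.
reported_f1 = round(f1_score(0.643, 0.525), 3)
```
]]></code>
                <p>Using the precision (0.643) and recall (0.525) reported for CatBoost in the abstract, the formula gives an F1 score of 0.578, which agrees with the reported value. The pairwise AUC loop is quadratic in sample size and is meant for illustration rather than for large datasets.</p>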
            </sec>
            <sec id="sec13">
                <title>Feature importance and explainability</title>
                <p>SHapley Additive exPlanations (SHAP) were used to explain model predictions and assess the contribution of each feature to the final model output. A summary plot was created to illustrate the ten most influential features. In the SHAP summary plot, red denotes high feature values and blue denotes low feature values, while the position of each point on the horizontal axis shows whether that value increases or decreases the predicted CKD probability. SHAP values were calculated for the CatBoost model, which had the highest ROC-AUC among all evaluated algorithms.</p>
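                <p>The idea behind SHAP can be shown with an exact Shapley computation on a toy model. This sketch is not the tree-based algorithm of the SHAP library and does not use study data: the three feature names and the toy value function are hypothetical, and all orderings are enumerated, which is feasible only for a handful of features.</p>
                <code language="python"><![CDATA[
```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(features: Sequence[str],
                   value: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Average each feature's marginal contribution over all feature orderings."""
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = frozenset()
        for name in order:
            with_feature = present | {name}
            totals[name] += value(with_feature) - value(present)
            present = with_feature
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_value(present: FrozenSet[str]) -> float:
    """Hypothetical model output when only the features in `present` are known."""
    score = 0.10                              # baseline output with no features
    if "x1" in present:
        score += 0.20
    if "x2" in present:
        score += 0.15
    if "x1" in present and "x2" in present:
        score += 0.05                         # interaction shared between x1 and x2
    return score                              # x3 never changes the output


phi = shapley_values(["x1", "x2", "x3"], toy_value)
total_attribution = sum(phi.values())
```
]]></code>
                <p>In this toy case the interaction term is split equally between x1 and x2, x3 receives zero, and the attributions sum to the difference between the full-model output and the baseline (0.40). That additivity property is what allows SHAP values to be read as per-feature contributions to an individual prediction.</p>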
            </sec>
            <sec id="sec14">
                <title>Statistical analysis</title>
                <p>All statistical analyses were conducted using Python (version 3.9) with the PyCaret, scikit-learn, and SHAP libraries. Continuous data are expressed as mean&#x00a0;&#x00b1;&#x00a0;standard deviation (SD), whereas categorical variables are presented as frequencies and percentages. Baseline characteristics were compared between individuals who developed CKD and those who did not, using independent t-tests for continuous variables and chi-square tests for categorical variables. A two-tailed p-value of less than 0.05 was deemed statistically significant. No correction for multiple comparisons was applied owing to the exploratory nature of the model development.</p>
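                <p>For a binary baseline characteristic, the chi-square comparison described above reduces to a 2&#x00d7;2 table. The sketch below computes the Pearson statistic without continuity correction and its p-value for one degree of freedom using only the standard library. The function name and the example counts are hypothetical; the study does not state whether a continuity correction was applied.</p>
                <code language="python"><![CDATA[
```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows could be CKD / no CKD and columns feature present / absent.
    Returns (statistic, two-sided p-value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts, not study data.
stat, p_value = chi_square_2x2(10, 20, 20, 10)
significant = p_value < 0.05
```
]]></code>
                <p>For these made-up counts the statistic is about 6.67 and the p-value is just under 0.01, so the difference would be declared significant at the 0.05 level used in this study.</p>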
            </sec>
            <sec id="sec15">
                <title>Ethics approval and consent to participate</title>
                <p>The study used de-identified secondary data obtained from the BPJS Kesehatan Prolanis database. Ethical approval was obtained from the Institutional Review Board of the Faculty of Public Health at Diponegoro University (approval number: 1.EA/KEPK-FKM/2026). The ethics committee waived the requirement for informed consent because the study relied on retrospective analysis of anonymised secondary data, involved no direct interaction with human participants, and did not provide researchers with any identifying information at any point. Consequently, neither written nor verbal informed consent was obtained, in accordance with the committee&#x2019;s waiver.</p>
            </sec>
        </sec>
        <sec id="sec16" sec-type="results">
            <title>Results</title>
            <sec id="sec17">
                <title>Study population and baseline characteristics</title>
                <p>A total of 7,581 patients were included in the final analysis (
                    <xref ref-type="fig" rid="f1">
Figure 1</xref>). The mean age of the cohort was 54.2&#x00a0;years (SD&#x00a0;9.0), and 54.1% of patients were female. The mean number of visits per patient was 90.8, and the mean number of recorded diagnoses was 30.6. Overweight or obesity was present in 27.2% of the cohort. Hypertension was the most common comorbidity (9.7%), followed by cardiovascular disease (5.1%) and heart failure (0.3%). Antidiabetics were prescribed to 85.6% of patients, aspirin to 51.2%, proton pump inhibitors to 17.4%, and NSAIDs to 1.6%. CKD was identified in 864 patients (11.4%) (
                    <xref ref-type="table" rid="T1">
Table 1</xref>).</p>
                <fig fig-type="figure" id="f1" orientation="portrait" position="float">
                    <label>
Figure 1. </label>
                    <caption>
                        <title>Participant flow chart.</title>
                        <p>N = number of patients; CKD = chronic kidney disease.</p>
                    </caption>
                    <graphic id="gr1" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/202964/7acd0e53-c9ea-4a4d-bfa0-8e44e38c8272_figure1.gif"/>
                </fig>
                <table-wrap id="T1" orientation="portrait" position="float">
                    <label>
Table 1. </label>
                    <caption>
                        <title>Demographic and outcome characteristics.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Variables</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">
n (%)</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Total patients</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">7581 (100)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Age, years (mean&#x00a0;&#x00b1;&#x00a0;SD)</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">54.2&#x00a0;&#x00b1;&#x00a0;9.0</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Gender</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;Male</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3477 (45.9)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;Female</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">4104 (54.1)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Visit numbers, mean</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">90.8</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Overweight/Obesity</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">2061 (27.2)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Comorbidity</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;Hypertension</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">732 (9.7)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;Cardiovascular disease</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">385 (5.1)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;Heart failure</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">22 (0.3)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;Diagnoses count, mean</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">30.6</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Drugs</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;Antidiabetics</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">6493 (85.6)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;Aspirin</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">3879 (51.2)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;Proton Pump Inhibitors</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">1320 (17.4)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;NSAIDs</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">125 (1.6)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">Outcome of having CKD</td>
                                <td colspan="1" rowspan="1"/>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;Yes</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">864 (11.4)</td>
                            </tr>
                            <tr>
                                <td align="left" colspan="1" rowspan="1" valign="top">&#x2003;No</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">6717 (88.6)</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <p>CKD&#x00a0;=&#x00a0;Chronic Kidney Disease; n&#x00a0;=&#x00a0;number; NSAIDs&#x00a0;=&#x00a0;Non-Steroidal Anti-Inflammatory Drugs; SD&#x00a0;=&#x00a0;standard deviation.</p>
                    </table-wrap-foot>
                </table-wrap>
            </sec>
            <sec id="sec18">
                <title>Model development and internal validation</title>
                <p>Six machine learning algorithms were developed and internally validated. The CatBoost classifier achieved the highest ROC-AUC (0.847), followed by Random Forest (0.840), LightGBM (0.836), Logistic Regression (0.826), XGBoost (0.821), and Decision Tree (0.697). Random Forest had the highest accuracy (0.810), whereas CatBoost and Logistic Regression each reached an accuracy of 0.797. CatBoost had a precision of 0.643, a recall of 0.525, and an F1 score of 0.578 (
                    <xref ref-type="table" rid="T2">
Table 2</xref>). The receiver operating characteristic (ROC) curves for all six models indicated that CatBoost had the highest true-positive rate across most of the false-positive rate range (
                    <xref ref-type="fig" rid="f2">
Figure 2</xref>).</p>
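                <p>As a consistency check on Table 2, the F1 score is the harmonic mean of precision and recall, F1 = 2PR/(P + R). The short Python sketch below recomputes F1 from the precision and recall values listed in the table. The function name <monospace>f1_score</monospace> and the dictionary <monospace>reported</monospace> are illustrative names introduced here and are not part of the study code.</p>
                <code language="python"><![CDATA[
```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, F1) exactly as listed in Table 2.
reported = {
    "Lr": (0.651376, 0.503546, 0.568000),
    "Rf": (0.663934, 0.574468, 0.615970),
    "Dt": (0.540541, 0.567376, 0.553633),
    "Xgboost": (0.639344, 0.553191, 0.593156),
    "Lgbm": (0.622951, 0.539007, 0.577947),
    "Catboost": (0.643478, 0.524823, 0.578125),
}

for model, (precision, recall, f1_listed) in reported.items():
    f1_recomputed = f1_score(precision, recall)
    # Print the recomputed and tabulated values side by side.
    print(f"{model:9s} recomputed={f1_recomputed:.3f} listed={f1_listed:.3f}")
```
]]></code>
                <p>With these inputs, each recomputed F1 agrees with the tabulated value to within the rounding of the six-decimal entries.</p>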
                <table-wrap id="T2" orientation="portrait" position="float">
                    <label>
Table 2. </label>
                    <caption>
                        <title>Performance comparison of the six machine learning models.</title>
                    </caption>
                    <table content-type="article-table" frame="hsides">
                        <thead>
                            <tr>
                                <th align="left" colspan="1" rowspan="1" valign="top">Model</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Accuracy</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Precision</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">Recall</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">F1</th>
                                <th align="left" colspan="1" rowspan="1" valign="top">ROC_AUC</th>
                            </tr>
                        </thead>
                        <tbody>
                            <tr>
                                <td align="center" colspan="1" rowspan="1" valign="middle">Lr</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.796992</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.651376</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.503546</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.568000</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.826196</td>
                            </tr>
                            <tr>
                                <td align="center" colspan="1" rowspan="1" valign="middle">Rf</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.810150</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.663934</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.574468</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.615970</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.839900</td>
                            </tr>
                            <tr>
                                <td align="center" colspan="1" rowspan="1" valign="middle">Dt</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.757519</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.540541</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.567376</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.553633</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.696731</td>
                            </tr>
                            <tr>
                                <td align="center" colspan="1" rowspan="1" valign="middle">Xgboost</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.798872</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.639344</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.553191</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.593156</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.820736</td>
                            </tr>
                            <tr>
                                <td align="center" colspan="1" rowspan="1" valign="middle">Lgbm</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.791353</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.622951</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.539007</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.577947</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.835673</td>
                            </tr>
                            <tr>
                                <td align="center" colspan="1" rowspan="1" valign="middle">Catboost</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.796992</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.643478</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.524823</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.578125</td>
                                <td align="left" colspan="1" rowspan="1" valign="top">0.847373</td>
                            </tr>
                        </tbody>
                    </table>
                    <table-wrap-foot>
                        <p>Catboost&#x00a0;=&#x00a0;Categorical Boosting; Dt&#x00a0;=&#x00a0;Decision Tree; F1&#x00a0;=&#x00a0;F1-score; Lgbm&#x00a0;=&#x00a0;Light Gradient Boosting Machine; Lr&#x00a0;=&#x00a0;Logistic Regression; Rf&#x00a0;=&#x00a0;Random Forest; ROC_AUC&#x00a0;=&#x00a0;Area Under the Receiver Operating Characteristic Curve; Xgboost&#x00a0;=&#x00a0;Extreme Gradient Boosting.</p>
                    </table-wrap-foot>
                </table-wrap>
                <fig fig-type="figure" id="f2" orientation="portrait" position="float">
                    <label>
Figure 2. </label>
                    <caption>
                        <title>Receiver operating characteristic curves of the top five models.</title>
                        <p>AUC = area under the curve; ROC = receiver operating characteristic curve.</p>
                    </caption>
                    <graphic id="gr2" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/202964/7acd0e53-c9ea-4a4d-bfa0-8e44e38c8272_figure2.gif"/>
                </fig>
            </sec>
            <sec id="sec19">
                <title>Feature importance and SHAP analysis</title>
                <p>SHAP analysis identified the ten most influential features for model predictions. High values (red) of rapid-acting insulin analogue use were associated with a higher predicted likelihood of CKD, whereas low values (blue) were associated with a lower predicted risk. Likewise, high values of amlodipine, furosemide, folic acid, and BUN increased the predicted probability of CKD. Conversely, advanced age, chronic ischaemic heart disease, and diseases of the pulp and periapical tissues were associated with a lower predicted probability of CKD (
                    <xref ref-type="fig" rid="f3">
Figure 3</xref>). The apparently protective associations of chronic ischaemic heart disease and pulp and periapical tissue diseases may reflect healthcare-seeking behaviour or unmeasured confounding in claims-based data.</p>
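                <p>To make the idea behind SHAP attributions concrete, the following minimal sketch computes exact Shapley values for a hypothetical three-feature toy score. The score function, its coefficients, and the binary feature names are assumptions chosen purely for illustration; they are not the study&#x2019;s CatBoost model, its feature encoding, or its SHAP pipeline.</p>
                <code language="python"><![CDATA[
```python
from itertools import combinations
from math import factorial


def toy_score(x):
    """Hypothetical risk score; coefficients are illustrative, not study results."""
    score = 0.10
    score += 0.30 * x["rapid_insulin"]
    score += 0.20 * x["bun_high"]
    score += 0.15 * x["rapid_insulin"] * x["bun_high"]  # interaction term
    score -= 0.05 * x["older_age"]
    return score


def shapley_values(model, x, baseline):
    """Exact Shapley values: features outside a coalition take baseline values."""
    features = list(x)
    n = len(features)
    values = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                # Weight of a coalition of size k that excludes feature f.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_f = {g: x[g] if (g in subset or g == f) else baseline[g]
                          for g in features}
                without_f = {g: x[g] if g in subset else baseline[g]
                             for g in features}
                total += weight * (model(with_f) - model(without_f))
        values[f] = total
    return values


# Hypothetical patient and all-zero reference point.
patient = {"rapid_insulin": 1, "bun_high": 1, "older_age": 1}
baseline = {"rapid_insulin": 0, "bun_high": 0, "older_age": 0}

phi = shapley_values(toy_score, patient, baseline)
for name, value in phi.items():
    print(f"{name:14s} {value:+.3f}")
```
]]></code>
                <p>In this toy setting the attributions sum to the difference between the patient&#x2019;s score and the baseline score. Positive attributions push the prediction upward and negative ones push it downward, which is how the horizontal axis of a SHAP summary plot is read.</p>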
                <fig fig-type="figure" id="f3" orientation="portrait" position="float">
                    <label>
Figure 3. </label>
                    <caption>
                        <title>Shapley additive explanations (SHAP) summary plot for the ten most important features.</title>
                        <p>Red points on the right side indicate feature values associated with a higher predicted probability of CKD. Blue points on the left side indicate feature values associated with a lower predicted probability of CKD.</p>
                    </caption>
                    <graphic id="gr3" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/202964/7acd0e53-c9ea-4a4d-bfa0-8e44e38c8272_figure3.gif"/>
                </fig>
            </sec>
            <sec id="sec20">
                <title>Web-based calculator</title>
                <p>A user-friendly web-based risk calculator was built on the CatBoost model to support clinical use. The calculator lets clinicians enter patient demographics, comorbidities, medications, and BUN values to obtain a personalised CKD risk estimate. The interface displays the main contributing features for each prediction, improving model transparency and clinical utility (
                    <xref ref-type="fig" rid="f4">
Figure 4</xref>).</p>
                <fig fig-type="figure" id="f4" orientation="portrait" position="float">
                    <label>
Figure 4. </label>
                    <caption>
                        <title>The web-based machine learning calculator system.</title>
                        <p>More than one entry can be provided for the &#x201c;Medication&#x201d; and &#x201c;Diagnoses&#x201d; fields.</p>
                    </caption>
                    <graphic id="gr4" orientation="portrait" position="float" xlink:href="https://f1000research-files.f1000.com/manuscripts/202964/7acd0e53-c9ea-4a4d-bfa0-8e44e38c8272_figure4.gif"/>
                </fig>
            </sec>
        </sec>
        <sec id="sec21" sec-type="discussion">
            <title>Discussion</title>
            <p>To the best of our knowledge, this study is among the first to develop and validate a web-based machine learning tool for predicting incident CKD in patients with T2DM participating in the Prolanis program in Indonesia. In this study of 7,581 patients, the CatBoost classifier achieved the best discriminative performance, with a ROC-AUC of 0.847, an accuracy of 0.797, and an F1 score of 0.578. These findings indicate that machine learning models, especially gradient boosting algorithms, could serve as efficient screening tools for identifying high-risk T2DM patients who may benefit from early nephroprotective therapies.</p>
            <p>Our results align with prior research utilising machine learning to predict CKD in diabetic cohorts. A thorough evaluation indicated that machine learning models for predicting diabetic kidney disease generally attain AUC values between 0.80 and 0.90, with gradient boosting techniques frequently surpassing conventional logistic regression.
                <sup>
                    <xref ref-type="bibr" rid="ref19">19</xref>
                </sup> A study by Song et al. (2020) utilising the Korean National Health Insurance Service database revealed that XGBoost attained an AUC of 0.84 for predicting CKD progression in T2DM patients, closely corresponding to our CatBoost AUC of 0.847.
                <sup>
                    <xref ref-type="bibr" rid="ref20">20</xref>
                </sup> The comparable performance of CatBoost in our analysis supports the growing consensus that ensemble tree-based techniques handle the high-dimensional, heterogeneous clinical data typical of real-world electronic health records well.</p>
            <p>However, our model&#x2019;s recall (0.525) and F1 score (0.578) were moderate, suggesting that although the model has reasonable overall accuracy, its sensitivity for detecting true CKD cases is limited. This is common in imbalanced datasets in which the outcome (CKD) occurs in only 11.4% of the population, as in our cohort. Similar issues have been documented in other studies; for example, research using a Japanese claims database reported a recall of 0.58 for CKD prediction with random forests.
                <sup>
                    <xref ref-type="bibr" rid="ref23">23</xref>
                </sup> These findings emphasise the necessity for prudence in utilising such models as exclusive decision-making instruments and stress the significance of integrating machine learning predictions with clinical expertise.</p>
            <p>Our SHAP analysis identified several important predictors of incident CKD, many of which are physiologically plausible and consistent with current clinical knowledge. The use of rapid-acting insulin analogues was one of the strongest predictors, with high values (shown in red on the SHAP summary plot) associated with a higher predicted likelihood of CKD. This finding presumably reflects confounding by indication rather than a direct nephrotoxic effect of insulin. Patients requiring rapid-acting insulin generally have longer diabetes duration, suboptimal glycaemic control, and greater insulin resistance, each of which is an independent risk factor for diabetic kidney damage. The United Kingdom Prospective Diabetes Study (UKPDS) established that intensive glucose-lowering therapy, including insulin, slowed the progression of microalbuminuria, which suggests that insulin use is a marker of disease severity rather than a cause of CKD.
                <sup>
                    <xref ref-type="bibr" rid="ref24">24</xref>,
                    <xref ref-type="bibr" rid="ref25">25</xref>
                </sup>
            </p>
            <p>The finding that the use of amlodipine and furosemide raises the predicted probability of CKD can be interpreted as a reflection of the underlying disease burden. Amlodipine, a calcium channel blocker (CCB), is frequently used for hypertension, which affects roughly 9.7% of our cohort and is a significant risk factor for CKD progression. Furosemide, a loop diuretic, is frequently prescribed to patients with fluid overload, which may indicate deteriorating renal function. Some experimental studies have suggested that calcium channel blockers may expose glomerular capillaries to elevated systemic pressures, and findings from major clinical trials, including the African American Study of Kidney Disease and Hypertension (AASK), indicate that CCB-based therapy may provide less renal protection than renin&#x2013;angiotensin system blockade, particularly in patients with hypertensive CKD.
                <sup>
                    <xref ref-type="bibr" rid="ref26">26</xref>
                </sup> Consequently, while these agents remain appropriate when clinically indicated, their use&#x2014;especially as monotherapy&#x2014;should be accompanied by careful monitoring of renal function, given their comparatively limited renoprotective effects.</p>
            <p>In contrast, our SHAP analysis indicated that chronic ischaemic heart disease and diseases of the pulp and periapical tissues appeared to be protective against CKD, a result that requires careful interpretation. The apparent protective effect of chronic ischaemic heart disease may be attributable to healthcare utilisation bias. Patients with diagnosed cardiovascular disease generally have more frequent clinician visits, better medication adherence (including to antihypertensive and antiplatelet medications), and tighter control of risk factors than patients without these diagnoses. Research indicates that high adherence to antihypertensive drugs (&#x2265;80%) is associated with a 33% lower risk of end-stage renal disease.
                <sup>
                    <xref ref-type="bibr" rid="ref27">27</xref>&#x2013;
                    <xref ref-type="bibr" rid="ref29">29</xref>
                </sup> Likewise, the apparent protective influence of pulp and periapical tissue diseases may reflect better health-seeking behaviour, as individuals who obtain regular dental care are likely to be more proactive in managing diabetes.
                <sup>
                    <xref ref-type="bibr" rid="ref30">30</xref>
                </sup> In addition, non-surgical periodontal therapy has been shown to reduce systemic inflammation, which may slow the progression of CKD, although residual confounding remains a possibility.</p>
            <p>The CatBoost model showed strong internal validation performance, with a ROC-AUC of 0.847, which compares favourably with previously published CKD prediction models. A systematic study assessed CKD prediction models and found that most attained AUC values ranging from 0.70 to 0.85,
                <sup>
                    <xref ref-type="bibr" rid="ref17">17</xref>
                </sup> placing our model at the upper end of that range. Furthermore, our use of standard administrative claims data, rather than specialised laboratory assessments or imaging, improves the model&#x2019;s scalability and practical relevance in low- and middle-income settings such as Indonesia, where access to advanced diagnostic testing may be limited.</p>
            <p>Nonetheless, some aspects of model performance warrant examination. The CatBoost model&#x2019;s precision (0.643) exceeded its recall (0.525): when the model predicts CKD it is correct about 64% of the time, but it misses nearly half of the actual CKD cases. This trade-off between precision and recall is acceptable in a screening setting, where the objective is to identify a group of high-risk patients for confirmatory tests rather than to establish definitive diagnoses.
                <sup>
                    <xref ref-type="bibr" rid="ref31">31</xref>
                </sup> The web-based tool we created enables doctors to modify the classification threshold according to local resources and preferences; for example, a reduced threshold enhances recall (identifying more true cases) while compromising precision (resulting in more false positives necessitating follow-up testing).</p>
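            <p>The effect of moving the classification threshold can be illustrated with a small Python sketch. The labels and predicted probabilities below are hypothetical values invented for illustration and are not study data; the helper <monospace>precision_recall</monospace> is likewise an illustrative name and not part of the web-based tool.</p>
            <code language="python"><![CDATA[
```python
# Hypothetical true labels (1 = CKD) and predicted probabilities; not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.90, 0.70, 0.45, 0.30, 0.60, 0.40, 0.35, 0.20, 0.10, 0.05]


def precision_recall(y_true, y_score, threshold):
    """Classify as positive when score >= threshold, then compute both metrics."""
    pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for p, t in zip(pred, y_true) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, y_true) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, y_true) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


for threshold in (0.5, 0.3):
    precision, recall = precision_recall(y_true, y_score, threshold)
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")
```
]]></code>
            <p>In this toy example, lowering the threshold from 0.5 to 0.3 raises recall from 0.50 to 1.00 while precision falls from about 0.67 to about 0.57, which mirrors the trade-off described above.</p>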
            <p>This study has several strengths. First, the use of a large, real-world dataset from Indonesia&#x2019;s national health insurance program (BPJS) supports the relevance of the findings to the Indonesian population and yields a model that can be incorporated into existing digital health infrastructure. Second, the comparison of several machine learning methods, with systematic hyperparameter optimisation and internal validation, follows best-practice guidance for developing prediction models. Third, SHAP analysis improves model interpretability, addressing a common criticism of &#x201c;black box&#x201d; machine learning models in clinical medicine.
                <sup>
                    <xref ref-type="bibr" rid="ref21">21</xref>
                </sup> Fourth, the development of an intuitive web-based calculator facilitates future integration into clinical practice.</p>
            <p>However, several limitations must be recognised. First, the retrospective cohort design carries the biases inherent in secondary data analysis, such as indication bias (as noted with insulin and antihypertensive drugs) and detection bias (patients with more frequent visits are more likely to receive a diagnosis of CKD). Second, the dataset was limited to variables typically collected in claims data; we could not include important clinical factors such as smoking status, alcohol intake, physical activity, dietary habits, family history of kidney disease, detailed blood pressure readings, haemoglobin A1c levels, or urine albumin-to-creatinine ratios. The absence of these recognised risk indicators may have limited model performance, especially recall. Third, the use of ICD-10 codes for outcome ascertainment may have introduced misclassification bias, given that chronic kidney disease is known to be under-recorded in administrative databases; our composite definition, which incorporates laboratory criteria, partially mitigates this problem. Fourth, the dataset was derived from a single health insurance program in Indonesia, which may limit generalisability to populations with different genetic backgrounds, healthcare systems, and practice patterns. Fifth, the complete-case analysis, which excluded patients with missing data, may have introduced selection bias if the data were not missing completely at random. Sixth, we did not conduct external validation using a temporally or geographically separate dataset; this is a crucial next step to verify the model&#x2019;s generalisability and robustness against overfitting before wider implementation. Finally, the moderate recall observed in this study reflects the class imbalance of the dataset, given the 11.4% prevalence of CKD. To address this in future iterations, techniques such as oversampling (e.g., the Synthetic Minority Over-sampling Technique, SMOTE) or systematic adjustment of class weights could be investigated to improve model sensitivity.</p>
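            <p>As a rough illustration of what adjusting class weights could involve, the sketch below derives inverse-frequency weights from the whole-cohort counts in Table 1 using the common &#x201c;balanced&#x201d; heuristic, weight = N / (number of classes &#x00d7; class count). This is an assumption-laden example and not the authors&#x2019; procedure: the class counts in the actual training split may differ, and the variable names are introduced here for illustration only.</p>
            <code language="python"><![CDATA[
```python
# Whole-cohort counts reported in Table 1; the training split may differ.
n_ckd = 864
n_no_ckd = 6717
n_total = n_ckd + n_no_ckd
n_classes = 2

# "Balanced" inverse-frequency heuristic: N / (n_classes * class_count).
class_weights = {
    "ckd": n_total / (n_classes * n_ckd),
    "no_ckd": n_total / (n_classes * n_no_ckd),
}

# Ratio of negatives to positives, often used as a single positive-class weight.
neg_to_pos_ratio = n_no_ckd / n_ckd

print(f"weight for CKD class:    {class_weights['ckd']:.2f}")
print(f"weight for no-CKD class: {class_weights['no_ckd']:.2f}")
print(f"negative/positive ratio: {neg_to_pos_ratio:.2f}")
```
]]></code>
            <p>With these counts, an error on a CKD case would be weighted roughly 7.8 times as heavily as an error on a non-CKD case, which would be expected to raise recall at some cost in precision.</p>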
            <p>
Future studies should prioritise external validation of the CatBoost model with independent datasets from various healthcare systems in Southeast Asia and beyond. Prospective validation studies would evaluate the model&#x2019;s real-world performance and clinical applicability, including its influence on clinician decision-making and patient outcomes. Subsequent model refinement could integrate additional predictors, including longitudinal changes in eGFR, albuminuria, and haemoglobin A1c, to improve predictive performance. Finally, interventional studies are required to determine whether machine learning-guided risk stratification leads to earlier nephrology referrals, better blood pressure and glucose management, and a lower incidence of end-stage renal disease in high-risk T2DM patients.</p>
        </sec>
        <sec id="sec22" sec-type="conclusions">
            <title>Conclusions</title>
            <p>This study developed and internally validated a CatBoost-based machine learning model for predicting incident CKD in patients with T2DM, using routinely collected claims data from Indonesia&#x2019;s BPJS Prolanis program. The model showed strong discriminative ability (AUC 0.847) and identified clinically relevant predictors, such as rapid-acting insulin use, amlodipine, furosemide, and increased BUN levels. The model&#x2019;s moderate recall leaves room for improvement, yet its comparatively higher precision and its interpretability support its use as a preliminary screening tool to inform clinical suspicion and identify high-risk individuals for targeted preventive interventions, rather than as a stand-alone diagnostic instrument. A web-based calculator was created to support clinical use and is well suited to Indonesia&#x2019;s primary care digital environment. Embedding the CatBoost model directly into the national BPJS P-Care system would allow general practitioners to perform real-time risk stratification during routine visits and provide timely nephroprotective interventions for high-risk type 2 diabetes patients in the Prolanis program. External validation and prospective implementation studies are necessary before widespread clinical deployment.</p>
        </sec>
    </body>
    <back>
        <sec id="sec25" sec-type="data-availability">
            <title>Data availability statement</title>
            <p>All datasets supporting this article are accessible via the following link: 
                <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.19634415">https://doi.org/10.5281/zenodo.19634415</ext-link>.
                <sup>
                    <xref ref-type="bibr" rid="ref32">32</xref>
                </sup>
            </p>
            <p>Data are available under the terms of the 
                <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International</ext-link>.</p>
            <sec id="sec26">
                <title>Extended data</title>
                <p>Zenodo: Supplementary data: 
                    <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.19521819">https://doi.org/10.5281/zenodo.19521819</ext-link>.
                    <sup>
                        <xref ref-type="bibr" rid="ref33">33</xref>
                    </sup>
                </p>
                <p>This project contains the following extended data:
                    <list list-type="bullet">
                        <list-item>
                            <label>&#x2022;</label>
                            <p>Tripod Checklist.docx.</p>
                        </list-item>
                    </list>
                </p>
                <p>Data are available under the terms of the 
                    <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International</ext-link>.</p>
            </sec>
        </sec>
        <ack>
            <title>Acknowledgements</title>
            <p>We thank BPJS Kesehatan for access to the Prolanis dataset and the Faculty of Public Health, Universitas Diponegoro, for administrative and technical support.</p>
        </ack>
        <ref-list>
            <title>References</title>
            <ref id="ref1">
                <label>1</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Francis</surname>
                            <given-names>A</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Harhay</surname>
                            <given-names>MN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ong</surname>
                            <given-names>ACM</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Chronic kidney disease and the global public health agenda: an international consensus.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Rev. Nephrol.</italic>
</source>
                    <year>2024</year>;<volume>20</volume>(<issue>7</issue>):<fpage>473</fpage>&#x2013;<lpage>485</lpage>.
                    <pub-id pub-id-type="pmid">38570631</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41581-024-00820-6</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref2">
                <label>2</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ke</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liang</surname>
                            <given-names>J</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Liu</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Burden of chronic kidney disease and its risk-attributable burden in 137 low- and middle-income countries, 1990&#x2013;2019: results from the Global Burden of Disease Study 2019.</article-title>
                    <source>

                        <italic toggle="yes">BMC Nephrol.</italic>
</source>
                    <year>2022</year>;<volume>23</volume>(<issue>1</issue>):<fpage>17</fpage>.
                    <pub-id pub-id-type="pmid">34986789</pub-id>
                    <pub-id pub-id-type="doi">10.1186/s12882-021-02597-3</pub-id>
                    <pub-id pub-id-type="pmcid">PMC8727977</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref3">
                <label>3</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Bikbov</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Purcell</surname>
                            <given-names>CA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Levey</surname>
                            <given-names>AS</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Global, regional, and national burden of chronic kidney disease, 1990&#x2013;2017: a systematic analysis for the Global Burden of Disease Study 2017.</article-title>
                    <source>

                        <italic toggle="yes">Lancet.</italic>
</source>
                    <year>2020</year>;<volume>395</volume>(<issue>10225</issue>):<fpage>709</fpage>&#x2013;<lpage>733</lpage>.
                    <pub-id pub-id-type="pmid">32061315</pub-id>
                    <pub-id pub-id-type="doi">10.1016/S0140-6736(20)30045-3</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7049905</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref4">
                <label>4</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Kovesdy</surname>
                            <given-names>CP</given-names>
                        </name>
</person-group>:
                    <article-title>Epidemiology of chronic kidney disease: an update 2022.</article-title>
                    <source>

                        <italic toggle="yes">Kidney Int. Suppl.</italic>
</source>
                    <year>2022</year>;<volume>12</volume>(<issue>1</issue>):<fpage>7</fpage>&#x2013;<lpage>11</lpage>.
                    <pub-id pub-id-type="pmid">35529086</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.kisu.2021.11.003</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9073222</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref5">
                <label>5</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hustrini</surname>
                            <given-names>NM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Susalit</surname>
                            <given-names>E</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Harimurti</surname>
                            <given-names>K</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Prevalence, incidence and risk factors of chronic kidney disease in people with diabetes and hypertension, and the prognosis and kidney function decline in Indonesia: a multicentre cross-sectional study in primary care centres.</article-title>
                    <source>

                        <italic toggle="yes">BMJ Open.</italic>
</source>
                    <year>2025</year>;<volume>15</volume>(<issue>10</issue>):<fpage>e103779</fpage>.
                    <pub-id pub-id-type="pmid">41125263</pub-id>
                    <pub-id pub-id-type="doi">10.1136/bmjopen-2025-103779</pub-id>
                    <pub-id pub-id-type="pmcid">PMC12548581</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref6">
                <label>6</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Duncan</surname>
                            <given-names>BB</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Magliano</surname>
                            <given-names>DJ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Boyko</surname>
                            <given-names>EJ</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">IDF diabetes atlas 11th edition 2025: global prevalence and projections for 2050.</italic>
</source>
                    <publisher-name>Oxford University Press</publisher-name>;<year>2025</year>; vol.<volume>41</volume>:<fpage>7</fpage>&#x2013;<lpage>9</lpage>.
                    <pub-id pub-id-type="doi">10.1093/ndt/gfaf177</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref7">
                <label>7</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Johnston-Webber</surname>
                            <given-names>C</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Bencomo-Bermudez</surname>
                            <given-names>I</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wharton</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A conceptual framework to assess the health, socioeconomic and environmental burden of chronic kidney disease.</article-title>
                    <source>

                        <italic toggle="yes">Health Policy.</italic>
</source>
                    <year>2025</year>;<volume>152</volume>:<fpage>105244</fpage>.
                    <pub-id pub-id-type="pmid">39827831</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.healthpol.2024.105244</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref8">
                <label>8</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Jha</surname>
                            <given-names>V</given-names>
                        </name>

                        <name name-style="western">
                            <surname>al-Ghamdi</surname>
                            <given-names>SMG</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>G</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Global economic burden associated with chronic kidney disease: a pragmatic review of medical costs for the Inside CKD research programme.</article-title>
                    <source>

                        <italic toggle="yes">Adv. Ther.</italic>
</source>
                    <year>2023</year>;<volume>40</volume>(<issue>10</issue>):<fpage>4405</fpage>&#x2013;<lpage>4420</lpage>.
                    <pub-id pub-id-type="pmid">37493856</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s12325-023-02608-9</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10499937</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref9">
                <label>9</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Thomas</surname>
                            <given-names>MC</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Brownlee</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Susztak</surname>
                            <given-names>K</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Diabetic kidney disease.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Rev. Dis. Primers.</italic>
</source>
                    <year>2015</year>;<volume>1</volume>(<issue>1</issue>):<fpage>15018</fpage>.
                    <pub-id pub-id-type="pmid">27188921</pub-id>
                    <pub-id pub-id-type="doi">10.1038/nrdp.2015.18</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7724636</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref10">
                <label>10</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Thomas</surname>
                            <given-names>B</given-names>
                        </name>
</person-group>:
                    <article-title>The global burden of diabetic kidney disease: time trends and gender gaps.</article-title>
                    <source>

                        <italic toggle="yes">Curr. Diab. Rep.</italic>
</source>
                    <year>2019</year>;<volume>19</volume>(<issue>4</issue>):<fpage>18</fpage>.
                    <pub-id pub-id-type="pmid">30826889</pub-id>
                    <pub-id pub-id-type="doi">10.1007/s11892-019-1133-6</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref11">
                <label>11</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hoogeveen</surname>
                            <given-names>EK</given-names>
                        </name>
</person-group>:
                    <article-title>The epidemiology of diabetic kidney disease.</article-title>
                    <source>

                        <italic toggle="yes">Kidney and Dialysis.</italic>
</source>
                    <year>2022</year>;<volume>2</volume>(<issue>3</issue>):<fpage>433</fpage>&#x2013;<lpage>442</lpage>.
                    <pub-id pub-id-type="doi">10.3390/kidneydial2030038</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref12">
                <label>12</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Chen</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lee</surname>
                            <given-names>K</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ni</surname>
                            <given-names>Z</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Diabetic kidney disease: challenges, advances, and opportunities.</article-title>
                    <source>

                        <italic toggle="yes">Kidney Diseases.</italic>
</source>
                    <year>2020</year>;<volume>6</volume>(<issue>4</issue>):<fpage>215</fpage>&#x2013;<lpage>225</lpage>.
                    <pub-id pub-id-type="pmid">32903946</pub-id>
                    <pub-id pub-id-type="doi">10.1159/000506634</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7445658</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref13">
                <label>13</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Alicic</surname>
                            <given-names>RZ</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Rooney</surname>
                            <given-names>MT</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Tuttle</surname>
                            <given-names>KR</given-names>
                        </name>
</person-group>:
                    <article-title>Diabetic kidney disease: challenges, progress, and possibilities.</article-title>
                    <source>

                        <italic toggle="yes">Clin. J. Am. Soc. Nephrol.</italic>
</source>
                    <year>2017</year>;<volume>12</volume>(<issue>12</issue>):<fpage>2032</fpage>&#x2013;<lpage>2045</lpage>.
                    <pub-id pub-id-type="pmid">28522654</pub-id>
                    <pub-id pub-id-type="doi">10.2215/CJN.11491116</pub-id>
                    <pub-id pub-id-type="pmcid">PMC5718284</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref14">
                <label>14</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hauwanga</surname>
                            <given-names>WN</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Abdalhamed</surname>
                            <given-names>TY</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ezike</surname>
                            <given-names>LA</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The pathophysiology and vascular complications of diabetes in chronic kidney disease: A comprehensive review.</article-title>
                    <source>

                        <italic toggle="yes">Cureus.</italic>
</source>
                    <year>2024</year>;<volume>16</volume>(<issue>12</issue>).
                    <pub-id pub-id-type="doi">10.7759/cureus.76498</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref15">
                <label>15</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Wu</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ding</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Andoh</surname>
                            <given-names>V</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>The mechanism of hyperglycemia-induced renal cell injury in diabetic nephropathy disease: an update.</article-title>
                    <source>

                        <italic toggle="yes">Life.</italic>
</source>
                    <year>2023</year>;<volume>13</volume>(<issue>2</issue>):<fpage>539</fpage>.
                    <pub-id pub-id-type="pmid">36836895</pub-id>
                    <pub-id pub-id-type="doi">10.3390/life13020539</pub-id>
                    <pub-id pub-id-type="pmcid">PMC9967500</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref16">
                <label>16</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Zhang</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Li</surname>
                            <given-names>W</given-names>
                        </name>
</person-group>:
                    <article-title>Association between the fatty liver index, metabolic dysfunction-associated steatotic liver disease, and the risk of kidney stones.</article-title>
                    <source>

                        <italic toggle="yes">Kidney Blood Press. Res.</italic>
</source>
                    <year>2025</year>;<volume>50</volume>(<issue>1</issue>):<fpage>115</fpage>&#x2013;<lpage>130</lpage>.
                    <pub-id pub-id-type="pmid">39746337</pub-id>
                    <pub-id pub-id-type="doi">10.1159/000543404</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11844708</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref17">
                <label>17</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Collins</surname>
                            <given-names>GS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Omar</surname>
                            <given-names>O</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Shanyinde</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>A systematic review finds prediction models for chronic kidney disease were poorly reported and often developed using inappropriate methods.</article-title>
                    <source>

                        <italic toggle="yes">J. Clin. Epidemiol.</italic>
</source>
                    <year>2013</year>;<volume>66</volume>(<issue>3</issue>):<fpage>268</fpage>&#x2013;<lpage>277</lpage>.
                    <pub-id pub-id-type="pmid">23116690</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jclinepi.2012.06.020</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref18">
                <label>18</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Feng</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>AY</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jun</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Characterization of risk prediction models for acute kidney injury: a systematic review and meta-analysis.</article-title>
                    <source>

                        <italic toggle="yes">JAMA Netw. Open.</italic>
</source>
                    <year>2023</year>;<volume>6</volume>(<issue>5</issue>):<fpage>e2313359</fpage>.
                    <pub-id pub-id-type="pmid">37184837</pub-id>
                    <pub-id pub-id-type="doi">10.1001/jamanetworkopen.2023.13359</pub-id>
                    <pub-id pub-id-type="pmcid">PMC12011341</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref19">
                <label>19</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ravizza</surname>
                            <given-names>S</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Huschto</surname>
                            <given-names>T</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Adamov</surname>
                            <given-names>A</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Predicting the early risk of chronic kidney disease in patients with diabetes using real-world data.</article-title>
                    <source>

                        <italic toggle="yes">Nat. Med.</italic>
</source>
                    <year>2019</year>;<volume>25</volume>(<issue>1</issue>):<fpage>57</fpage>&#x2013;<lpage>59</lpage>.
                    <pub-id pub-id-type="pmid">30617317</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41591-018-0239-8</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref20">
                <label>20</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Song</surname>
                            <given-names>X</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Waitman</surname>
                            <given-names>LR</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yu</surname>
                            <given-names>ASL</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Longitudinal risk prediction of chronic kidney disease in diabetic patients using a temporal-enhanced gradient boosting machine: retrospective cohort study.</article-title>
                    <source>

                        <italic toggle="yes">JMIR Med. Inform.</italic>
</source>
                    <year>2020</year>;<volume>8</volume>(<issue>1</issue>):<fpage>e15510</fpage>.
                    <pub-id pub-id-type="pmid">32012067</pub-id>
                    <pub-id pub-id-type="doi">10.2196/15510</pub-id>
                    <pub-id pub-id-type="pmcid">PMC7055762</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref21">
                <label>21</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Lundberg</surname>
                            <given-names>SM</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Lee</surname>
                            <given-names>S-I</given-names>
                        </name>
</person-group>:
                    <article-title>A unified approach to interpreting model predictions.</article-title>
                    <source>

                        <italic toggle="yes">Adv. Neural Inf. Process. Syst.</italic>
</source>
                    <year>2017</year>;<volume>30</volume>.</mixed-citation>
            </ref>
            <ref id="ref22">
                <label>22</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hasan</surname>
                            <given-names>F</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Muhtar</surname>
                            <given-names>MS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wu</surname>
                            <given-names>D</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Web-based artificial intelligence to predict cognitive impairment following stroke: A multicenter study.</article-title>
                    <source>

                        <italic toggle="yes">J. Stroke Cerebrovasc. Dis.</italic>
</source>
                    <year>2024</year>;<volume>33</volume>(<issue>8</issue>):<fpage>107826</fpage>.
                    <pub-id pub-id-type="pmid">38908612</pub-id>
                    <pub-id pub-id-type="doi">10.1016/j.jstrokecerebrovasdis.2024.107826</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref23">
                <label>23</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Makino</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Yoshimoto</surname>
                            <given-names>R</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Ono</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Artificial intelligence predicts the progression of diabetic kidney disease using big data machine learning.</article-title>
                    <source>

                        <italic toggle="yes">Sci. Rep.</italic>
</source>
                    <year>2019</year>;<volume>9</volume>(<issue>1</issue>):<fpage>11862</fpage>.
                    <pub-id pub-id-type="pmid">31413285</pub-id>
                    <pub-id pub-id-type="doi">10.1038/s41598-019-48263-5</pub-id>
                    <pub-id pub-id-type="pmcid">PMC6694113</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref24">
                <label>24</label>
                <mixed-citation publication-type="journal">
                    <collab>UK Prospective Diabetes Study (UKPDS) Group</collab>:
                    <article-title>Intensive blood-glucose control with sulphonylureas or insulin compared with conventional treatment and risk of complications in patients with type 2 diabetes (UKPDS 33).</article-title>
                    <source>

                        <italic toggle="yes">Lancet.</italic>
</source>
                    <year>1998</year>;<volume>352</volume>(<issue>9131</issue>):<fpage>837</fpage>&#x2013;<lpage>853</lpage>.
                    <pub-id pub-id-type="doi">10.1016/S0140-6736(98)07019-6</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref25">
                <label>25</label>
                <mixed-citation publication-type="book">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Usman</surname>
                            <given-names>M</given-names>
                        </name>
</person-group>:
                    <source>

                        <italic toggle="yes">Multiple Risk Factor Control in Individuals with Type 2 Diabetes and Microalbuminuria.</italic>
</source>
                    <publisher-name>University of Leicester</publisher-name>;<year>2020</year>.</mixed-citation>
            </ref>
            <ref id="ref26">
                <label>26</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Clemmer</surname>
                            <given-names>JS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Pruett</surname>
                            <given-names>WA</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Hester</surname>
                            <given-names>RL</given-names>
                        </name>
</person-group>:
                    <article-title>Predicting chronic responses to calcium channel blockade with a virtual population of African Americans with hypertensive chronic kidney disease.</article-title>
                    <source>

                        <italic toggle="yes">Frontiers in Systems Biology.</italic>
</source>
                    <year>2024</year>;<volume>4</volume>:<fpage>1327357</fpage>.
                    <pub-id pub-id-type="pmid">39606582</pub-id>
                    <pub-id pub-id-type="doi">10.3389/fsysb.2024.1327357</pub-id>
                    <pub-id pub-id-type="pmcid">PMC11600446</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref27">
                <label>27</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Roy</surname>
                            <given-names>L</given-names>
                        </name>

                        <name name-style="western">
                            <surname>White-Guay</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Dorais</surname>
                            <given-names>M</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Adherence to antihypertensive agents improves risk reduction of end-stage renal disease.</article-title>
                    <source>

                        <italic toggle="yes">Kidney Int.</italic>
</source>
                    <year>2013</year>;<volume>84</volume>(<issue>3</issue>):<fpage>570</fpage>&#x2013;<lpage>577</lpage>.
                    <pub-id pub-id-type="pmid">23698228</pub-id>
                    <pub-id pub-id-type="doi">10.1038/ki.2013.103</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref28">
                <label>28</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ma</surname>
                            <given-names>B</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Jia</surname>
                            <given-names>Y</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Wang</surname>
                            <given-names>H</given-names>
                        </name>

                        <etal/>
</person-group>:
                    <article-title>Systematic Review and Network Meta-Analysis of the Comparative Effectiveness of Adherence Enhancement Strategies in Chronic Kidney Disease.</article-title>
                    <source>

                        <italic toggle="yes">J. Evid. Based Med.</italic>
</source>
                    <year>2025</year>;<volume>18</volume>(<issue>3</issue>):<fpage>e70078</fpage>.
                    <pub-id pub-id-type="pmid">40994060</pub-id>
                    <pub-id pub-id-type="doi">10.1111/jebm.70078</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref29">
                <label>29</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Burnier</surname>
                            <given-names>M</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Egan</surname>
                            <given-names>BM</given-names>
                        </name>
</person-group>:
                    <article-title>Adherence in hypertension: a review of prevalence, risk factors, impact, and management.</article-title>
                    <source>

                        <italic toggle="yes">Circ. Res.</italic>
</source>
                    <year>2019</year>;<volume>124</volume>(<issue>7</issue>):<fpage>1124</fpage>&#x2013;<lpage>1140</lpage>.
                    <pub-id pub-id-type="doi">10.1161/CIRCRESAHA.118.313220</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref30">
                <label>30</label>
                <mixed-citation publication-type="journal">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Ghanem</surname>
                            <given-names>AS</given-names>
                        </name>

                        <name name-style="western">
                            <surname>Nagy</surname>
                            <given-names>AC</given-names>
                        </name>
</person-group>:
                    <article-title>Oral health's role in diabetes risk: a cross-sectional study with sociodemographic and lifestyle insights.</article-title>
                    <source>

                        <italic toggle="yes">Front. Endocrinol. (Lausanne).</italic>
</source>
                    <year>2024</year>;<volume>15</volume>:<fpage>1342783</fpage>.
                    <pub-id pub-id-type="pmid">38516406</pub-id>
                    <pub-id pub-id-type="doi">10.3389/fendo.2024.1342783</pub-id>
                    <pub-id pub-id-type="pmcid">PMC10955347</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref31">
                <label>31</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Powers</surname>
                            <given-names>DM</given-names>
                        </name>
</person-group>:
                    <article-title>Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation.</article-title>
                    <source>

                        <italic toggle="yes">arXiv preprint arXiv:2010.16061.</italic>
</source>
                    <year>2020</year>.</mixed-citation>
            </ref>
            <ref id="ref32">
                <label>32</label>
                <mixed-citation publication-type="data">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hasan</surname>
                            <given-names>F</given-names>
                        </name>
</person-group>:
                    <data-title>ML Dataset.</data-title>[Data set].
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2026</year>.
                    <pub-id pub-id-type="doi">10.5281/zenodo.19634415</pub-id>
                </mixed-citation>
            </ref>
            <ref id="ref33">
                <label>33</label>
                <mixed-citation publication-type="other">
                    <person-group person-group-type="author">

                        <name name-style="western">
                            <surname>Hasan</surname>
                            <given-names>F</given-names>
                        </name>
</person-group>:
                    <article-title>Tripod Checklist.</article-title>
                    <source>

                        <italic toggle="yes">Zenodo.</italic>
</source>
                    <year>2026</year>.
                    <pub-id pub-id-type="doi">10.5281/zenodo.19521819</pub-id>
                </mixed-citation>
            </ref>
        </ref-list>
    </back>
    <sub-article article-type="reviewer-report" id="report483727">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.198473.r483727</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Santana</surname>
                        <given-names>Ewaldo Eder Carvalho</given-names>
                    </name>
                    <xref ref-type="aff" rid="r483727a1">1</xref>
                    <role>Referee</role>
                </contrib>
                <aff id="r483727a1">
                    <label>1</label>State University of Maranh&#x00e3;o, S&#x00e3;o Lu&#x00ed;s, Brazil</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>4</day>
                <month>6</month>
                <year>2026</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2026 Santana EEC</copyright-statement>
                <copyright-year>2026</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport483727" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.179913.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>This study developed and internally validated a machine learning model (CatBoost) to predict incident chronic kidney disease (CKD) among patients with type 2 diabetes mellitus (T2DM) enrolled in Indonesia&#x2019;s national health insurance chronic disease management program (Prolanis).</p>
            <p>Using a retrospective cohort of 7,581 patients, the authors compared six algorithms and found that CatBoost achieved the best discriminative performance (AUC = 0.847). They used SHAP analysis to improve interpretability and built a web&#x2011;based risk calculator for clinical use. The manuscript is well structured, follows TRIPOD guidelines, and addresses an important public health gap in low&#x2011; and middle&#x2011;income countries.</p>
            <p>All my answers to the standard review questions were "Yes", meaning that the study is scientifically sound, the methods are appropriate, the results are clearly presented, the discussion acknowledges limitations, and the conclusions are supported by the data.</p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Yes</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>Signal Processing; Machine Learning</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.</p>
        </body>
        <sub-article article-type="response" id="comment16357-483727">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Hasan</surname>
                            <given-names>Faizul</given-names>
                        </name>
                        <aff>Faculty of Nursing, Chulalongkorn University, Bangkok, Thailand</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>None</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>4</day>
                    <month>6</month>
                    <year>2026</year>
                </pub-date>
            </front-stub>
            <body>
                <p>We sincerely appreciate your time and insightful evaluation.</p>
                <p>We are heartened that you considered the study scientifically robust, the methodologies suitable, and the results well-founded. Your acknowledgement of this work's significance for low- and middle-income countries is particularly important to us.</p>
            </body>
        </sub-article>
    </sub-article>
    <sub-article article-type="reviewer-report" id="report483733">
        <front-stub>
            <article-id pub-id-type="doi">10.5256/f1000research.198473.r483733</article-id>
            <title-group>
                <article-title>Reviewer response for version 1</article-title>
            </title-group>
            <contrib-group>
                <contrib contrib-type="author">
                    <name>
                        <surname>Pramana</surname>
                        <given-names>Cipta</given-names>
                    </name>
                    <xref ref-type="aff" rid="r483733a1">1</xref>
                    <role>Referee</role>
                    <uri content-type="orcid">https://orcid.org/0000-0001-8991-0147</uri>
                </contrib>
                <aff id="r483733a1">
                    <label>1</label>UNNES - Universitas Negeri Semarang, Central Java, Indonesia</aff>
            </contrib-group>
            <author-notes>
                <fn fn-type="conflict">
                    <p>
                        <bold>Competing interests: </bold>No competing interests were disclosed.</p>
                </fn>
            </author-notes>
            <pub-date pub-type="epub">
                <day>27</day>
                <month>5</month>
                <year>2026</year>
            </pub-date>
            <permissions>
                <copyright-statement>Copyright: &#x00a9; 2026 Pramana C</copyright-statement>
                <copyright-year>2026</copyright-year>
                <license xlink:href="https://creativecommons.org/licenses/by/4.0/">
                    <license-p>This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</license-p>
                </license>
            </permissions>
            <related-article ext-link-type="doi" id="relatedArticleReport483733" related-article-type="peer-reviewed-article" xlink:href="10.12688/f1000research.179913.1"/>
            <custom-meta-group>
                <custom-meta>
                    <meta-name>recommendation</meta-name>
                    <meta-value>approve-with-reservations</meta-value>
                </custom-meta>
            </custom-meta-group>
        </front-stub>
        <body>
            <p>
                <bold>Comments for the Authors:</bold> 
                <list list-type="order">
                    <list-item>
                        <p>
                            <bold>Introduction (Citation Formatting):</bold> There are several typographical errors regarding text citations in the Introduction section. For instance, in paragraph 2, "3.14 Hyperglycemia triggers..." and "14.15 Well-defined risk factors..." appear to be misformatted reference numbers. Similarly, in paragraph 4, "...from 0.80 to 0.90.19.20" fuses the text with reference numbers 19 and 20. Please thoroughly revise the manuscript to ensure all citations strictly adhere to the journal's formatting guidelines.</p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Abstract (Methodology):</bold> To improve clarity for a broader audience, please consider adding a brief statement regarding the data-splitting ratio (e.g., 80% training and 20% testing) in the Methods subsection of the Abstract, if the word count permits.</p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Discussion/Conclusion Alignment:</bold> The authors correctly acknowledged the moderate recall (0.525) due to the imbalanced nature of the dataset. I highly recommend ensuring that the text in the Conclusion explicitly reinforces that this web-based calculator is intended as a preliminary screening tool to guide clinical suspicion rather than a definitive diagnostic instrument. Consistent terminology should also be checked (e.g., matching "ischaemic" vs "ischemic").</p>
                    </list-item>
                </list> 
                <bold>Methodology and Results Comments:</bold> 
                <list list-type="order">
                    <list-item>
                        <p>
                            <bold>Methodology (Participant Flowchart):</bold> While the inclusion and exclusion criteria are explicitly defined, the manuscript lacks a participant flow diagram. To adhere strictly to the TRIPOD statement referenced by the authors, please provide a flowchart illustrating how the initial cohort was filtered down to the final sample size of 7,581 patients (e.g., numbers of excluded patients due to prior CKD or missing data).</p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Table 2 Title Typo:</bold> The title of Table 2 reads 
                            <italic>"Six nearest neighbor algorithms used in machine learning."</italic> This is factually incorrect as the models evaluated (CatBoost, XGBoost, Random Forest, etc.) are ensemble tree-based methods and logistic regression, not K-Nearest Neighbor algorithms. Please revise the title to accurately reflect the contents (e.g., 
                            <italic>"Performance comparison of the six machine learning models"</italic>).</p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Results (Table 1 Reconstruction):</bold> In the statistical analysis section, you mentioned comparing baseline parameters between individuals who developed CKD and those who did not using t-tests and chi-square tests. However, Table 1 only displays the aggregate characteristics of the total population. Please reconstruct Table 1 to separate the data into "CKD" and "Non-CKD" groups and include the corresponding p-values to show baseline statistical significance.</p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Inconsistent AUC Values:</bold> There is a minor inconsistency in how AUC values are reported. For example, CatBoost's AUC is reported as 0.847373 in Table 2, 0.847 in the main text, and 0.85 in Figure 1's legend. Please ensure consistency in decimal precision across the text, tables, and figures (3 decimal places is recommended, e.g., 0.847).</p>
                    </list-item>
                </list> 
                <bold>Discussion and Conclusion Comments for the Authors:</bold> 
                <list list-type="order">
                    <list-item>
                        <p>
                            <bold>Discussion (Typographical and Formatting Errors):</bold> There are a few minor formatting issues in the Discussion section. In paragraph 2 (Page 7), an erratic quotation mark appears inside the sentence: "...conventional logistic regression." A study by...". Additionally, on Page 9, a broken reference format is visible: "27- Likewise, the protective influence...". Please meticulously proofread the section to fix these citation alignment errors.</p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Discussion (Addressing Imbalanced Data):</bold> The authors provided an excellent and honest justification regarding the moderate recall due to the imbalanced nature of the dataset (11.4% CKD prevalence). However, it would strengthen the paper if the authors briefly suggested future technical workarounds in the limitation paragraph&#x2014;such as utilizing oversampling techniques (e.g., SMOTE) or adjusting algorithmic class weights to improve sensitivity in future iterations.</p>
                    </list-item>
                    <list-item>
                        <p>
                            <bold>Conclusions &amp; Practical Implications:</bold> The conclusion is concise and well-balanced. However, given that this study leverages the national BPJS Prolanis dataset, the practical application could be stated more clearly. I suggest the authors add 1&#x2013;2 sentences addressing how this tool could be realistically integrated into Indonesia's primary healthcare digital ecosystem (e.g., integration into the BPJS P-Care system) to assist general practitioners in risk stratification.</p>
                    </list-item>
                </list>
            </p>
            <p>Is the work clearly and accurately presented and does it cite the current literature?</p>
            <p>Yes</p>
            <p>If applicable, is the statistical analysis and its interpretation appropriate?</p>
            <p>Yes</p>
            <p>Are all the source data underlying the results available to ensure full reproducibility?</p>
            <p>Yes</p>
            <p>Is the study design appropriate and is the work technically sound?</p>
            <p>Yes</p>
            <p>Are the conclusions drawn adequately supported by the results?</p>
            <p>Partly</p>
            <p>Are sufficient details of methods and analysis provided to allow replication by others?</p>
            <p>Yes</p>
            <p>Reviewer Expertise:</p>
            <p>"No competing interests were disclosed."</p>
            <p>I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.</p>
        </body>
        <sub-article article-type="response" id="comment16443-483733">
            <front-stub>
                <contrib-group>
                    <contrib contrib-type="author">
                        <name>
                            <surname>Hasan</surname>
                            <given-names>Faizul</given-names>
                        </name>
                        <aff>Faculty of Nursing, Chulalongkorn University, Bangkok, Thailand</aff>
                    </contrib>
                </contrib-group>
                <author-notes>
                    <fn fn-type="conflict">
                        <p>
                            <bold>Competing interests: </bold>None</p>
                    </fn>
                </author-notes>
                <pub-date pub-type="epub">
                    <day>16</day>
                    <month>6</month>
                    <year>2026</year>
                </pub-date>
            </front-stub>
            <body>
                <p>Introduction (Citation Formatting): There are several typographical errors regarding text citations in the Introduction section. For instance, in paragraph 2, "3.14 Hyperglycemia triggers..." and "14.15 Well-defined risk factors..." appear to be misformatted reference numbers. Similarly, in paragraph 4, "...from 0.80 to 0.90.19.20" fuses the text with reference numbers 19 and 20. Please thoroughly revise the manuscript to ensure all citations strictly adhere to the journal's formatting guidelines.</p>
                <p>Response: Thank you for your comment. We have rechecked all references throughout the entire manuscript. As per F1000Research guidelines, references follow a free format but must be consistent across the manuscript. In our study, we have adopted the AMA reference style.</p>
                <p>Abstract (Methodology): To improve clarity for a broader audience, please consider adding a brief statement regarding the data-splitting ratio (e.g., 80% training and 20% testing) in the Methods subsection of the Abstract, if the word count permits.</p>
                <p>Response: Thank you for this constructive suggestion. We have revised the Methods subsection of the Abstract to include the data-splitting ratio, as the word count permits. The updated sentence now reads:</p>
                <p>"Six algorithms (Logistic Regression, Random Forest, Decision Tree, XGBoost, LightGBM, CatBoost) were trained on 80% of the data and internally validated on the remaining 20% to predict CKD."</p>
                <p>Discussion/Conclusion Alignment: The authors correctly acknowledged the moderate recall (0.525) due to the imbalanced nature of the dataset. I highly recommend ensuring that the text in the Conclusion explicitly reinforces that this web-based calculator is intended as a preliminary screening tool to guide clinical suspicion rather than a definitive diagnostic instrument. Consistent terminology should also be checked (e.g., matching "ischaemic" vs "ischemic").</p>
                <p>Response: Thank you for this important recommendation. We have made the following revisions:</p>
                <p>Conclusion section: We have explicitly revised the text to state that the web-based calculator is intended as a preliminary screening tool to guide clinical suspicion, not as a definitive diagnostic instrument. The revised conclusion now reads:</p>
                <p>&#x201c;The model's moderate recall indicates potential for enhancement, yet its high precision and interpretability highlight its utility as a preliminary screening tool to inform clinical suspicion and identify high-risk individuals for targeted preventive interventions, rather than serving as an independent diagnostic instrument.&#x201d;</p>
                <p>Terminology consistency: We have reviewed the entire manuscript and standardized all spelling to "ischaemic" (British English) throughout, ensuring consistency across all sections, including the Discussion and Conclusion.</p>
                <p>Methodology and Results Comments:</p>
                <p>Methodology (Participant Flowchart): While the inclusion and exclusion criteria are explicitly defined, the manuscript lacks a participant flow diagram. To adhere strictly to the TRIPOD statement referenced by the authors, please provide a flowchart illustrating how the initial cohort was filtered down to the final sample size of 7,581 patients (e.g., numbers of excluded patients due to prior CKD or missing data).</p>
                <p>Response: Thank you for your comment. We agree that a participant flow diagram is essential for TRIPOD compliance. As requested, we have now provided this flowchart as Figure 1 in the revised manuscript, illustrating the stepwise filtering of the initial cohort down to the final sample of 7,581 patients, including the numbers excluded due to prior CKD and missing data.</p>
                <p>Table 2 Title Typo: The title of Table 2 reads "Six nearest neighbor algorithms used in machine learning." This is factually incorrect as the models evaluated (CatBoost, XGBoost, Random Forest, etc.) are ensemble tree-based methods and logistic regression, not K-Nearest Neighbor algorithms. Please revise the title to accurately reflect the contents (e.g., "Performance comparison of the six machine learning models").</p>
                <p>Response: Thank you for catching this error. We have revised the title of Table 2 accordingly. The corrected title now reads:</p>
                <p>"Performance comparison of the six machine learning models"</p>
                <p>Results (Table 1 Reconstruction): In the statistical analysis section, you mentioned comparing baseline parameters between individuals who developed CKD and those who did not using t-tests and chi-square tests. However, Table 1 only displays the aggregate characteristics of the total population. Please reconstruct Table 1 to separate the data into "CKD" and "Non-CKD" groups and include the corresponding p-values to show baseline statistical significance.</p>
                <p>Response: We have reconstructed Table 1 to present baseline characteristics separately for CKD (n=864) and non-CKD (n=6,717) groups, with corresponding p-values from t-tests and chi-square tests. The revised table is included in the manuscript.</p>
                <p>Inconsistent AUC Values: There is a minor inconsistency in how AUC values are reported. For example, CatBoost's AUC is reported as 0.847373 in Table 2, 0.847 in the main text, and 0.85 in Figure 1's legend. Please ensure consistency in decimal precision across the text, tables, and figures (3 decimal places is recommended, e.g., 0.847).</p>
                <p>Response: Thank you for noting this inconsistency. We have standardized all AUC values to 3 decimal places (e.g., 0.847) throughout the manuscript, including the main text, Table 2, and Figure 1 legend.</p>
                <p> Discussion and Conclusion Comments for the Authors:&#x00a0;</p>
                <p> Discussion (Typographical and Formatting Errors): There are a few minor formatting issues in the Discussion section. In paragraph 2 (Page 7), a stray quotation mark appears inside the sentence: "...conventional logistic regression." A study by...". Additionally, on Page 9, a broken reference format is visible: "27- Likewise, the protective influence...". Please proofread the section carefully to fix these citation alignment errors.</p>
                <p> Response: Thank you for this note. We have corrected both the stray quotation mark on Page 7 and the broken reference format on Page 9. The entire Discussion section has been carefully proofread and revised to ensure that all citation formatting is accurate and aligned.</p>
                <p> Discussion (Addressing Imbalanced Data): The authors provided an excellent and honest justification for the moderate recall, attributing it to the imbalanced nature of the dataset (11.4% CKD prevalence). However, the paper would be stronger if the authors briefly suggested future technical workarounds in the limitations paragraph, such as using oversampling techniques (e.g., SMOTE) or adjusting algorithmic class weights to improve sensitivity in future iterations.</p>
                <p> Response:&#x00a0;We thank the reviewer for the positive feedback on our data justification. We agree that highlighting technical workarounds for future research enhances the manuscript.&#x00a0;Accordingly, we have revised the limitations paragraph in the Discussion section to explicitly mention the potential use of oversampling techniques (such as SMOTE) and the adjustment of algorithmic class weights to&#x00a0;optimize&#x00a0;sensitivity/recall in future iterations.&#x00a0;</p>
                <p> "...&#x00a0;The moderate recall noted in this study&#x00a0;indicates&#x00a0;the intrinsically uneven character of the dataset, considering the 11.4% prevalence of CKD. To address this in future iterations, technical solutions such as employing sophisticated oversampling methods (e.g., Synthetic Minority Over-sampling Technique, SMOTE) or systematically modifying algorithmic class weights could be investigated to enhance model sensitivity."&#x00a0;</p>
                <p> Conclusions &amp; Practical Implications: The conclusion is concise and well-balanced. However, given that this study&#x00a0;leverages&#x00a0;the national BPJS&#x00a0;Prolanis&#x00a0;dataset, the practical application could be&#x00a0;stated&#x00a0;more clearly. I suggest the authors add 1&#x2013;2 sentences addressing how this tool could be realistically integrated into Indonesia's primary healthcare digital ecosystem (e.g., integration into the BPJS P-Care system) to&#x00a0;assist&#x00a0;general practitioners in risk stratification.&#x00a0;</p>
                <p> Response:&#x00a0;We thank the reviewer for this excellent and practical recommendation. We agree that contextualizing the deployment pathway within Indonesia's existing digital health infrastructure significantly strengthens the clinical utility of the study.&#x00a0;Accordingly, we have updated the final paragraph of the manuscript to explicitly propose the integration of our&#x00a0;CatBoost-based predictive tool into the national BPJS P-Care platform to aid primary care physicians in automated risk stratification.&#x00a0;</p>
                <p> &#x201c;This web-based calculator is ideal for Indonesia's primary care digital environment as a preliminary screening tool. The&#x00a0;CatBoost&#x00a0;algorithm embedded directly into the national BPJS P-Care system would allow general practitioners to perform real-time risk stratification during routine visits and provide&#x00a0;timely&#x00a0;nephroprotective interventions for high-risk type 2 diabetes patients in the&#x00a0;Prolanis&#x00a0;program.&#x201d;&#x00a0;</p>
                <p> Is the work clearly and accurately&#x00a0;presented&#x00a0;and does it cite the current literature?&#x00a0;</p>
                <p> Yes&#x00a0;</p>
                <p> Is the study design&#x00a0;appropriate&#x00a0;and is the work technically sound?&#x00a0;</p>
                <p> Yes&#x00a0;</p>
                <p> Are sufficient details of methods and analysis provided to allow replication by others?&#x00a0;</p>
                <p> Yes&#x00a0;</p>
                <p> If applicable, is the statistical analysis and its interpretation&#x00a0;appropriate?&#x00a0;</p>
                <p> Yes&#x00a0;</p>
                <p> Are&#x00a0;all the source data underlying the results available to ensure full reproducibility?&#x00a0;</p>
                <p> Yes&#x00a0;</p>
                <p> Are the conclusions drawn adequately supported by the results?&#x00a0;</p>
                <p> Partly&#x00a0;</p>
                <p> Response:&#x00a0;Thank&#x00a0;you for your acknowledgement and consideration.</p>
            </body>
        </sub-article>
    </sub-article>
</article>
