Keywords
clinical decision support systems, chronic kidney disease, medication safety, renal dose adjustment, artificial intelligence, randomized controlled trials
This article is included in the Artificial Intelligence and Machine Learning gateway.
This article is included in the AI in Medicine and Healthcare collection.
Optimizing renal drug dosing to avoid toxicity is essential in chronic kidney disease (CKD), yet prescribing errors remain common. Clinical decision support systems (CDSS), both rule-based and artificial intelligence/machine learning (AI/ML)-based, are used to address this safety gap; however, their impact remains uncertain.
We performed a PRISMA-guided systematic review and meta-analysis of RCTs comparing rule-based or AI/ML CDSS with usual care comparators among adults with CKD or at risk of CKD-related prescribing errors. The primary outcome was a medication safety endpoint aligned with the CDSS logic (appropriate renal dosing, potentially inappropriate prescribing, and medication errors). Secondary outcomes were quality-of-care processes, clinical endpoints, use of health services, and patient-reported outcomes. To address heterogeneity, we supplemented meta-analysis with a structured Best Evidence Synthesis and trial-level mapping by delivery mode and workflow stage.
Among the 20 RCTs that met our inclusion criteria, 6 provided data for the meta-analysis. CDSS improved proximal medication-safety processes (RR 1.76; 95% CI, 1.13–2.74). The wide prediction interval indicates that effectiveness depends on implementation and local settings. Documentation of CKD in electronic health records improved (RR 1.19; 95% CI, 1.07–1.32), but downstream clinical outcomes were less studied and remain equivocal. The available evidence came predominantly from interruptive, order-entry interventions. Implementation barriers were common; clinician compliance ranged from 17% to 74% owing to alert fatigue, time constraints, and unclear understanding of system function and override processes.
CDSS for CKD have shown value in enhancing medication safety, but not all models have been successful. These process-based benefits are not yet supported by demonstrable improvements in clinical outcomes. This gap supports treating renal CDSS as part of safety-critical services, requiring auditable logic, clear severity tiers, and a traceable mechanism for overrides. A replicable evidence base requires agreed core outcomes and reporting standards.
Chronic kidney disease (CKD) is an important global health issue associated with high morbidity, mortality, and healthcare resource utilization. The multimorbidity with which CKD patients often present, combined with the polypharmacy regimens to which they are commonly exposed, leaves them vulnerable to harm from medications. Renal function–based prescribing, periodic monitoring, and timely treatment changes are required to ensure optimal management, prevent adverse outcomes, and delay disease progression.1–6 In practice, however, decisions on dosing and monitoring may be made under extreme time pressure, and clinicians may struggle with increasingly complex electronic health records (EHRs) that obscure key laboratory, diagnostic, and medication information at the point of care.7,8 These limitations are further exacerbated in nephrology practice, especially in patients who receive renally cleared medications, are exposed to nephrotoxic agents, or receive antihypertensive treatment. In such high-risk settings, even small prescribing errors can lead to acute kidney injury or drive unnecessary disease progression. These factors perpetuate a persistent gap between evidence-based recommendations and actual care, particularly for high-risk prescribing decisions, and identify medication safety in CKD as a priority domain for quality improvement and patient safety efforts.
Clinical decision support systems (CDSS) have emerged as a key strategy to close this gap.9–13 Broadly, current CDSS follow two trajectories. Rule-based systems encode guideline logic and expert rules into deterministic alerts or order sets, typically triggered by thresholds in kidney function, prescribed drugs, or other structured data.14,15 More recently, artificial intelligence–enabled CDSS (AI-CDSS) have been developed that use longitudinal data and machine-learning models to generate individualized risk predictions or treatment recommendations.16–18 Both approaches aim to standardize care processes, support safer prescribing, and reduce preventable harm without unduly increasing clinicians’ cognitive workload. From a translational perspective, renal CDSS operationalize evidence into real-time decisions at the point of care. In practice, however, CDSS are deployed on a variety of platforms (e.g., EHR/CPOE-integrated, web-based, mobile, or standalone tools) and embedded at different workflow stages (e.g., prescribing vs monitoring), which may have important implications for uptake, usability, and effectiveness. Examining how trials map onto implementation configurations and outcome categories can therefore show where the evidence base is well developed and where downstream evaluation remains weak.
Although CDSS are increasingly used, their overall effect on the quality and safety of care in CKD remains unclear. Prior reviews of CDSS have often pooled heterogeneous conditions and interventions, making it difficult to determine where CDSS provide the greatest value for CKD specifically.19–22 Moreover, most outcomes are proximal process measures (e.g., renal dosing, CKD documentation, or guideline adherence) that are not directly linked to downstream patient-safety outcomes such as adverse drug events, hospitalisation, and progression to end-stage kidney disease.
Implementation practices and their relationship to safety are inconsistently reported and inadequately aggregated.23–26 The main obstacle to CDSS implementation in nephrology is no longer data capture but understanding which system designs affect real-world clinical outcomes and endpoints (i.e., both kidney disease progression and dialysis-related complications). Answering this question is important if decision support tools are to be more than documentation tools. These evidence gaps indicate that meta-analytic summary estimates alone are inadequate for informing clinical policy or directing future research in renal drug safety. What is needed instead is an overarching model that transparently relates CDSS functionality and workflow integration to actual care processes. The contrast between consistent improvements in CKD documentation and widely varying effects on dosing errors points to strong context dependence. If this context dependence is disregarded, there is a danger of relying on statistical abstractions that hide the actual determinants of success.
This review followed the PRISMA guidelines (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) and the Cochrane Handbook for Systematic Reviews of Interventions.27,28 The review was registered on PROSPERO (Registration No. CRD420251139903).
PubMed, Scopus, ScienceDirect, and ProQuest were searched systematically through August 2025. Searches combined Medical Subject Headings (MeSH) with free-text terms, without language restrictions. The detailed search strategy is available as Extended data (Appendix 1: Search strategy).
Eligible studies were restricted to RCTs examining digital CDSS applied to clinical decision-making in CKD, including EHR/CPOE-integrated solutions and web-based or mobile (standalone) electronic decision supports. Interventions could be rule-based or AI/ML-assisted. Trials had to report at least one process outcome concerning medication safety. The primary outcome was appropriate dosing according to renal function. Secondary outcomes included CKD identification/documentation, mortality, renal function (measured by serum creatinine/eGFR), BP/CV events, and usability of the CDSS, as well as implementation outcomes such as adherence to or compliance with CDSS recommendations, when reported. Non-randomized studies, non-electronic tools, and interventions not specific to CKD were excluded. Details are available in Tables 1 and 2.
Records were imported into Covidence and deduplicated; titles and abstracts were then screened, and full texts were reviewed for eligibility. A standard Excel form was used to record study features (title/authors/year, setting/country, RCT design/participants), CDSS features (functionality/classifications/intervention(s)), comparator(s), outcomes (primary/secondary), data types, and arm-specific sample sizes/follow-up. Screening and data extraction were performed by two independent reviewers, with a third reviewer resolving any disagreements.
Risk of bias across studies was assessed with Cochrane’s RoB 2 for the following five domains: randomization process, deviations from intended interventions, missing outcome data, measurement of outcome, and selection of reported results. Risk of bias (RoB) judgements were performed using the RoB 2 guidance: if all domains were at low risk, then the overall RoB was classified as low; if one or more domains were judged to be high, the overall judgement was high; otherwise, some concerns. Methodological quality was assessed by two reviewers working independently of one another, and disagreements were resolved by discussion or a third reviewer.
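The overall-judgement rule described above can be written as a small function. This is a minimal sketch of the rule exactly as stated in this section; the function name `overall_rob` and the lower-case labels are illustrative assumptions, not part of the RoB 2 tool itself.

```python
from typing import Mapping

# The five RoB 2 domains named in the text.
DOMAINS = (
    "randomization process",
    "deviations from intended interventions",
    "missing outcome data",
    "measurement of the outcome",
    "selection of the reported result",
)
LEVELS = {"low", "some concerns", "high"}


def overall_rob(judgements: Mapping[str, str]) -> str:
    """Derive the overall judgement from the five domain-level judgements."""
    missing = [d for d in DOMAINS if d not in judgements]
    if missing:
        raise ValueError(f"missing domain judgement(s): {missing}")
    levels = [judgements[d] for d in DOMAINS]
    if any(level not in LEVELS for level in levels):
        raise ValueError(f"judgements must be one of {sorted(LEVELS)}")
    if "high" in levels:  # one or more domains at high risk
        return "high"
    if all(level == "low" for level in levels):  # every domain at low risk
        return "low"
    return "some concerns"  # anything in between


# Hypothetical trial: low risk everywhere except missing outcome data.
example = {d: "low" for d in DOMAINS}
example["missing outcome data"] = "some concerns"
print(overall_rob(example))
```

Encoding the rule this way makes the two reviewers' overall judgements reproducible from their domain-level entries, so any disagreement can be traced to a specific domain.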
We did not formally assess publication bias (e.g., funnel plot–based methods) because the number of studies contributing to each synthesis was limited and the included trials were highly heterogeneous in interventions and outcomes.
Meta-analyses were performed using RevMan 5.2 (accessed September 2025) with fixed- and random-effects models. Dichotomous outcomes were pooled as risk ratios using the Mantel–Haenszel method; continuous outcomes were pooled as mean differences (MD) or standardized mean differences (SMD). We used fixed-effect models for low heterogeneity (I² ≤ 50% and conceptually homogeneous outcomes) and random-effects models for high heterogeneity (I² > 50%) or important design/setting variation. Uncertainty was expressed with 95% CIs, and heterogeneity with I² and χ². A 0.5 continuity correction was applied to zero cells when necessary, and multi-arm studies were combined as per the Cochrane Handbook to prevent double-counting. Because of the substantial heterogeneity in our main effects, we also presented pooled effect estimates with prediction intervals (PI) to convey the range of results expected across clinical settings. Because implementation varies from site to site and pooled estimates alone are of limited use, we supplemented the meta-analysis with a structured Best Evidence Synthesis (BES) and trial-level mapping by delivery mode and workflow stage, incorporating TIDieR elements; trial network mapping was performed using Cytoscape (version 3.10.4).
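As an illustration of the fixed-effect pooling step, the sketch below computes a Mantel–Haenszel risk ratio from 2×2 counts. It is a hand-rolled approximation, not RevMan's code. The function name is hypothetical, and adding 0.5 to every cell of a trial that contains a zero cell is one common convention assumed here. The example counts are the two CKD-documentation trials reported in the results (86/165 vs 88/188 and 417/1568 vs 389/1783).

```python
from typing import Iterable, Tuple

# (events_cdss, n_cdss, events_control, n_control)
Trial = Tuple[float, float, float, float]


def mantel_haenszel_rr(trials: Iterable[Trial]) -> float:
    """Fixed-effect Mantel-Haenszel pooled risk ratio."""
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n2 in trials:
        # Assumed convention: if any cell of the 2x2 table is zero,
        # add 0.5 to all four cells (so each arm total grows by 1).
        if 0 in (a, n1 - a, c, n2 - c):
            a, c = a + 0.5, c + 0.5
            n1, n2 = n1 + 1.0, n2 + 1.0
        total = n1 + n2
        numerator += a * n2 / total
        denominator += c * n1 / total
    return numerator / denominator


ckd_documentation = [
    (86, 165, 88, 188),      # Peralta 2020
    (417, 1568, 389, 1783),  # Sperl-Hillen 2023
]
print(round(mantel_haenszel_rr(ckd_documentation), 2))  # 1.2
```

With these counts the sketch gives a pooled value of about 1.20, in line with the fixed-effect estimate reported for EHR-documented CKD recognition.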
The search strategy identified 1,288 articles. After removal of 173 duplicates, 396 articles were screened by title and abstract, of which 284 were excluded for not meeting the inclusion criteria. We sought retrieval of 112 reports, of which 75 were not retrieved, leaving 37 reports assessed for eligibility. In total, 20 trials met our inclusion criteria, and 6 of them were included in the quantitative analysis. Figure 1 depicts the literature retrieval process.
The main characteristics of the included studies are summarized in Table 3. The CDSS interventions were heterogeneous in their technical platform (e.g., from integrated EHR/CPOE systems to standalone mobile applications) and core functions (e.g., renal dosing support or CKD documentation). Because this diversity is a major source of heterogeneity, we addressed it upfront through a structured Best Evidence Synthesis (BES).
| Source (First author, Publication year) | Year | Country | Trial designs | Number of participants | Specific technology/platform used |
|---|---|---|---|---|---|
| Abdel-Kader, 201148 | 2011 | United States | Cluster randomized controlled trial | 248 | EpicCare (Epic Systems) |
| Alhodaib, 202037 | 2020 | United Kingdom | Randomized controlled trial (RCT) | 54 (junior doctor and specialist nurses) | Mobile app developed by Medic Genie |
| Awdishu, 201629 | 2016 | United States | Cluster randomized controlled trial | 1278 | Epic EHR |
| Bhardwaja, 201132 | 2011 | United States | Randomized controlled trial (RCT) | 32917 | Drug Renal Alert Pharmacy (DRAP) program |
| Carroll, 201842 | 2018 | United States | Cluster randomized controlled trial | 6699 | Three separate systems were used due to vendor issues; integrated with electronic health records (EHRs). |
| Chen, 202239 | 2022 | China | Randomized controlled trial (RCT) | 120 | Internet + H2H model |
| Erler, 201235 | 2012 | Germany | Cluster randomized controlled trial | 404 | DOSING software (standalone CD version) |
| Field, 200930 | 2009 | Canada | Cluster randomized controlled trial | 800 (residents) | Meditech MAGIC platform using Provider Order Management (POM4.9 upgraded to 5.5) |
| Januzzi, 202540 | 2025 | 34 countries (significant European representation) | Randomized controlled trial (RCT) | 4401 | Machine learning-based risk prediction algorithm for diabetic kidney disease (DKD). |
| Lim, 202536 | 2025 | Taiwan | Randomized controlled trial (RCT) | 124 | The study involved four machine learning models: bagged Regression trees with random effects (REEM) trees, Mixed-effect random forest (MERF), Long short-term memory (LSTM) networks LSTM-I, and LSTM-II. |
| Locatelli, 200945 | 2009 | Bulgaria, Croatia, Germany, Italy, Latvia, Poland, Romania, Serbia and Montenegro | Cluster randomized controlled trial | 599 | Interactive centrally controlled database with secure Internet access case report form and a CDS system generating guideline-based management prompts. |
| Mancini, 200738 | 2007 | Italy | Crossover RCT | 55 | Dialog Advanced dialysis machine by Braun, integrated with bioLogic RR system. |
| Patzer, 201846 | 2018 | United States | Randomized controlled trial (RCT) | 443 | Mobile and web-based application |
| Peralta, 202044 | 2020 | United States | Cluster randomized controlled trial | 1819 | Electronic Health Record (EHR) |
| Peralta, 202034 | 2020 | United States | Cluster randomized controlled trial | 524 | EpicCare electronic health record (EHR) system |
| Samal, 202243 | 2022 | United States | Randomized controlled trial (RCT) | 5590 | Web-based application with a single-page interface, web server, database, and application programming interface. |
| Samal, 202441 | 2024 | United States | Cluster randomized controlled trial | 2026 | Epic Systems software |
| Sequist, 201847 | 2018 | United States | Cluster randomized controlled trial | 7691 | Epic Systems EHR |
| Sperl-Hillen, 202333 | 2023 | United States | Cluster randomized controlled trial | 6420 | EpicCare electronic health record (EHR) system |
| Terrell, 201031 | 2010 | United States | Randomized controlled trial (RCT) | 2783 | The AI-CDSS intervention is a computerized decision support system integrated into a computerized physician order entry (CPOE) system. |
To look beyond pooled estimates, we used a layered model for evidence synthesis. This comprised an Evidence Map (Figure 4) to provide an overview of coverage across outcome domains, and a Best-Evidence Table (Table 7) identifying robust patterns and areas of poor reporting within each configuration. Finally, we illustrated the ‘architecture’ of this evidence with a Network Map (Figure 5) that shows how particular trials and implementation configurations map onto outcome domains.
For the six studies included in the meta-analysis, the risk-of-bias assessment is shown in Figure 2. For the first outcome, four RCTs were generally at high risk or raised some concerns in several domains (e.g., deviations from intended interventions, missing outcome data, selection of the reported result), whereas measurement of the outcome was largely at low risk;29–32 this is consistent with the very high heterogeneity observed (I² = 97%; pooled RR 1.76, 95% CI 1.13–2.74). For the second outcome, Peralta (2020) was at low risk across all domains, whereas Sperl-Hillen (2023) was judged at high risk in several domains.33,34 Effects on EHR-documented CKD recognition were consistent (I² = 0%) with precise estimates (fixed-effect: RR 1.20 [1.08–1.33]; random-effects: RR 1.19 [1.07–1.32]).

Each row represents a study and columns A–F represent domains (A, randomization process; B, deviations from intended interventions; C, missing outcome data; D, measurement of the outcome; E, selection of the reported result; F, overall bias), with blue = low risk (√), orange = some concerns (!), and vermilion = high risk (X).
3.4.1. Appropriate renal dosing
CDSS significantly improved appropriate renal dosing compared with standard care, with a risk ratio (RR) of 1.76 (95% CI 1.13–2.74; Z = 2.51; p = 0.01). However, heterogeneity among the included studies was considerable (I² = 97%; τ² = 0.19; χ² = 72.66; df = 3; p < 0.00001). A random-effects model was therefore used to pool results across the four RCTs.29–32
The effect of these interventions varied widely; study-specific effects ranged from RR 1.20 to 3.00. This wide range illustrates how strongly success depends on clinical context and implementation approach. In absolute numbers, appropriate dosing was achieved in 2,926 of 5,565 medication instances (52.6%) in the CDSS arms compared with 2,230 of 6,588 (33.8%) in the control arms (Figure 3).

A. Appropriate renal-dosing action (per medication instance).
B. EHR-documented CKD recognition.
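To make concrete what a "wide prediction interval" means for the renal-dosing estimate, the sketch below applies one commonly used approximation (pooled log effect ± t with k − 2 degrees of freedom × √(τ² + SE²)) to the summary values reported above (RR 1.76, 95% CI 1.13–2.74, τ² = 0.19, k = 4 trials). This is an illustrative reconstruction from rounded summary figures, not the interval computed by the authors. The standard error is back-calculated from the confidence interval, and the t quantile is hard-coded; both are assumptions.

```python
import math


def prediction_interval(rr, ci_low, ci_high, tau2, t_crit):
    """Approximate 95% prediction interval for a random-effects pooled RR."""
    log_rr = math.log(rr)
    # Back-calculate the standard error of the pooled log RR from its 95% CI.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    return math.exp(log_rr - half_width), math.exp(log_rr + half_width)


# Two-sided 95% Student-t quantile for df = k - 2 = 2 (k = 4 trials).
T_CRIT_DF2 = 4.303

low, high = prediction_interval(1.76, 1.13, 2.74, tau2=0.19, t_crit=T_CRIT_DF2)
print(f"approximate 95% prediction interval: {low:.2f} to {high:.2f}")
```

Under these assumptions the interval spans values well below and well above 1, which is the sense in which a pooled RR of 1.76 does not guarantee benefit in any single new setting.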
3.4.2. Prescribing and medication safety
Most studies favored rule-based CDSS. Of seven studies assessing medication dosing or prescribing, five favored CDSS, reporting reduced overdosing (19.2% vs 34.5%, OR 0.45; p < 0.001),35 less excessive dosing (43% vs 74%, p = 0.001),31 fewer medication errors (33% vs 49%, p < 0.001),32 differences across several subcomponents of appropriate dosing (RR: dose 0.95; frequency 2.4; avoidance 2.6; information 1.8),30 and higher rates of appropriate renal dosing (17% vs 5.7%, OR 1.89; 95% CI 1.45–2.47; p < 0.001).29 One study found no significant difference in prescription failure (1.26% vs 0.5%, p = 0.11),36 and another was neutral regarding workflow/guideline outcomes.37 Beyond dosing and prescribing, one hemodialysis study reported less severe hypotension (8.3% vs 13.8%; p = 0.01),38 and one study reported improvements in nutritional and biochemical parameters (albumin, prealbumin, hemoglobin, BUN), anthropometrics, quality of life, and satisfaction.39 One risk prediction study did not involve clinical implementation,40 and another provided no outcome data or direction of effect. Overall, the evidence on medication safety and appropriateness supports CDSS; however, effect sizes and outcome types differ across studies.
3.4.3. Best Evidence Synthesis (BES) by implementation configuration
The magnitude of effect varied, with study-specific RRs ranging from 1.20 to 3.00. Although the raw counts point towards a benefit in the CDSS arms (2,926 of 5,565 doses appropriate vs 2,230 of 6,588 in controls), heterogeneity was marked (I² = 97%), so a single average effect has limited meaning. This motivated a Best Evidence Synthesis to investigate why CDSS performed better in some settings than in others.
Our mapping of trials by delivery mode and workflow timing reveals a lopsided evidence base. There is ample evidence on how interruptive alerts affect prescribing errors, but evidence on long-term clinical significance and implementation hurdles remains thin and varies widely between settings (see Figure 4). This gap between process-level success and clinical impact is critical. The specific signals and reporting deficiencies for each configuration are summarized in Table 7. Finally, Figure 5 gives a network-level perspective on the ‘architecture’ of the evidence base, showing which trials support each configuration and where the evidence remains fragile. A more detailed trial-by-trial mapping can be found in the Extended data (Appendix 2).

This map illustrates how reporting is distributed across three dimensions: (A) process, (B) clinical, and (C) implementation outcomes. Trials are grouped by mode of delivery (vertical axis; e.g., active, interruptive alerts vs. passive dashboards) and by timing in the workflow (horizontal axis; e.g., prescribing vs. monitoring). Values within each tile indicate the percentage of trials (in that category) reporting a particular outcome. The density of evidence, on a scale of 0–5, is depicted as a color scale, with darker colors indicating greater accumulated evidence. The map reveals that, while process results are well reported, clinical and implementation outcomes further downstream are inconsistently collected. This discrepancy highlights the need to standardise core outcome and KPI reporting in future CDSS trials.

Network showing how each randomized and quasi-randomized study was linked to implementation strategies (delivery mode × workflow timing) and the outcome domains reported. Trial nodes (blue) are each connected to a single configuration node (red), representing the intervention’s delivery mode and timing, and to domain nodes (orange), reflecting whether the trial reported process, clinical, or implementation outcomes. This trial-based network shows that the weight of evidence is concentrated at relatively few configuration hubs, with links to the clinical outcome nodes relying on a subset of trials, which suggests that the evidence on clinical impact is fragile and potentially biased.
3.5.1. EHR-documented CKD recognition
CDSS significantly increased EHR-documented CKD recognition compared with standard care (risk ratio [RR] = 1.19, 95% confidence interval [CI] = 1.07 to 1.32; Z = 3.30, p = 0.001). There was no heterogeneity (I² = 0%; τ² = 0.00; χ² = 0.53; df = 1; p = 0.47). Pooled data from two RCTs were analyzed using an inverse-variance random-effects model with restricted maximum likelihood (REML) estimation. The study-specific estimates were Peralta (2020)34 (RR = 1.11 [0.90–1.38]; weight, 24.5% [86/165 vs 88/188]) and Sperl-Hillen (2023)33 (RR = 1.22 [1.08–1.38]; weight, 75.5% [417/1568 vs 389/1783]). A total of 503 of 1733 events occurred in the CDSS arms vs. 477 of 1971 in the control arms. A summary of aggregated effects under fixed- and random-effects models is presented in Table 4.
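The figures in this paragraph can be approximately reproduced from the reported counts. The sketch below uses plain inverse-variance weighting on the log risk ratio rather than the REML fit named in the text; with essentially no between-trial heterogeneity the two would be expected to agree closely. Function and variable names are illustrative assumptions.

```python
import math


def log_rr_and_variance(a, n1, c, n2):
    """Log risk ratio and its large-sample variance for one trial."""
    log_rr = math.log((a / n1) / (c / n2))
    variance = 1 / a - 1 / n1 + 1 / c - 1 / n2
    return log_rr, variance


def inverse_variance_pool(trials):
    """Inverse-variance pooled RR with weights, Cochran's Q and I-squared."""
    stats = [log_rr_and_variance(*t) for t in trials]
    weights = [1 / var for _, var in stats]
    total = sum(weights)
    pooled = sum(w * y for w, (y, _) in zip(weights, stats)) / total
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, stats))
    df = len(stats) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    se = math.sqrt(1 / total)
    return {
        "rr": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "weights_pct": [100 * w / total for w in weights],
        "q": q,
        "i2": i2,
    }


trials = [
    (86, 165, 88, 188),      # Peralta 2020: CDSS vs control
    (417, 1568, 389, 1783),  # Sperl-Hillen 2023: CDSS vs control
]
result = inverse_variance_pool(trials)
print(round(result["rr"], 2), [round(x, 2) for x in result["ci"]])
print([round(w, 1) for w in result["weights_pct"]], round(result["q"], 2), result["i2"])
```

Run on these counts, the sketch returns a pooled RR of about 1.19 (1.07 to 1.32), weights of roughly 24.5% and 75.5%, Q ≈ 0.53, and I² = 0%, matching the values reported above.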
3.5.2. Mortality events
Two studies provided mortality data. In the randomized hemodialysis trial, mortality did not differ between AI-CDSS and conventional care (one death in each group). Other serious adverse events, such as hospital admission and blood transfusion, were also similar (7 vs 9 and 5 vs 5, respectively).36 The second study did not assess the impact of a CDSS intervention in clinical practice; it aimed to develop and validate a machine-learning-based risk algorithm. Renal and cardiovascular death formed part of the predicted composite outcome rather than a treatment-related endpoint, and the CDSS was not evaluated in clinical practice.40 Consequently, no intervention effect on mortality could be estimated from that trial, and it was excluded from the meta-analysis. Mortality outcomes are summarized in Table 5.
| Study | Outcome category | AI-CDSS results | Comparator results | Effect direction |
|---|---|---|---|---|
| Lim, 202536 | Death events | No difference | Standard care | Neutral |
| Januzzi, 202540 | Cardiovascular/renal death | Model predicts, not tested in practice | Not applicable | Not applicable |
3.5.3. Renal function
The evidence suggests a possible benefit of CDSS on renal function and selected physiological outcomes, but results vary by outcome type. Two studies reported benefits. One trial of a rule-based CDSS reported a less steep decline in estimated glomerular filtration rate (eGFR) slope when facilitation was added (−0.01 with CDSS plus facilitation vs −0.95 with CDSS alone; p < 0.001; favors intervention).42 An AI-CDSS study showed an increase in renal cortical blood flow (p < 0.05 between groups).39 Three other studies were neutral, showing no difference in hemoglobin control (within target with AI-CDSS vs physician management),36 CKD quality metrics,33 or laboratory monitoring.43 One study concerned development of a risk prediction model rather than clinical implementation and therefore presented no comparative effect data,40 and another did not report data or a direction of effect for this outcome.
3.5.4. Blood Pressure (BP) control and Cardiovascular (CV) outcomes
Evidence on the utility of CDSS for CV and BP-related outcomes is accumulating, but the direction of effect is mixed across the included studies. Of five studies, three favored CDSS and two were neutral. Among the three studies reporting BP outcomes (change or control), one showed a significantly greater decrease in systolic blood pressure (SBP) in the CDSS arm than in the standard care group (−14.6 mmHg vs −11.7 mmHg; p = 0.005),41 whereas two found no difference in BP control42 or BP change.44 In addition, one study reported improved quality of life (favoring AI-CDSS),39 and another reported a lower rate of severe intradialytic hypotension (also favoring AI-CDSS).38 Quantitative effect sizes and p-values were reported in only one study; the other four reported only the direction of effect without numerical values.
3.5.5. Usability and provider acceptance
Clinician engagement with CDSS was highly heterogeneous. Provider-level adherence in five studies with quantitative data ranged from 17% to 74%.29,31,34,45 Two studies described engagement only in qualitative terms (low use and high compliance),43,46 while four studies reported no adherence outcomes.32,33,36,37 User satisfaction was quantitatively evaluated in two studies, which reported low or no burden and satisfaction rates of 74% and 82%, respectively.37,44 Barriers to implementation and use were reported in eight studies and included alert fatigue, lack of training, time pressure, unfamiliarity with CDSS, resistance to change, technical problems (such as application crashes and disabled printing), economic reasons, COVID-19-related disruptions, and the overall time burden on users.29,31,33,36,37,43,45,46 Facilitators were cited in two studies, which reported integration of results and a reduction in false positives.32,34 Sustainability was reported in three papers: two reported plans to continue or increase use, and one found sustained error reduction.31,32,46 In general, clinician acceptance of CDSS was moderate and varied considerably; closer system integration may help improve CDSS adoption.
3.5.6. Guideline adherence and clinical process measures
For guideline adherence and process measures, two studies favored AI-CDSS implementation, two were neutral, and one reported no effect direction. AI-CDSS was linked to increased ACEi/ARB utilization and nephrology referral, and to fewer medication errors.32,47 In contrast, ESA guideline adherence was either non-inferior to usual care or remained low, which probably reflects human and systemic barriers.36,45 Heterogeneity of outcomes prevented pooling; overall, benefits appear concentrated in prescribing/referral and medication safety, whereas improving adherence may require more than decision support (Table 6).
| Study | Outcome | AI-CDSS results | Comparator | Effect direction |
|---|---|---|---|---|
| Lim, 202536 | ESA guideline adherence | Non-inferior | Standard care | Neutral |
| Sequist, 201847 | ACEi/ARB, nephrology referral | Improved in high-risk | Standard care | Favors CDSS |
| Bhardwaja, 201132 | Medication errors | Reduced | Standard care | Favors CDSS |
| Locatelli, 200945 | Adherence | Low even with CDSS | Standard care | Neutral |
This systematic review of 20 RCTs demonstrates that renal CDSS principally enhance proximal medication safety; however, their impact on downstream clinical outcomes is unclear.29–48 Most interventions were guideline-based rule systems incorporated into electronic prescribing or EHRs, with a minority being AI-powered tools. Compared with control conditions, across more than 74,000 patients and 854 health professionals, CDSS consistently shifted prescribing behaviour towards safety but did not clearly reduce mortality or progression to kidney failure. Because implementation varied so much between trials, we judged that a pooled estimate would disguise as much as it would reveal. We therefore supplemented our quantitative data with a structured synthesis and granular, trial-by-trial mapping (Figures 4 and 5, Table 7, and Table S2). This framing takes the discussion beyond a dichotomous question of whether CDSS ‘works’. It reveals the underlying structure of the research: where the evidence lies in relation to different delivery modes and workflow stages and, crucially, where downstream clinical evaluations are missing.
| Delivery mode | Workflow timing | Trials (n) | Included trials (first author, year) | Dominant function | Primary process target(s) (examples) | Process signal (trial-level) | Clinical endpoints evaluated (availability; summary) | Implementation measurement (availability; key types) | Recurrent missing elements (to enable future pooling) |
|---|---|---|---|---|---|---|---|---|---|
| Active interruptive alerts | Prescribing/order entry | 4 | Awdishu 201629; Samal 202441; Sperl-Hillen 202333; Terrell 201031 | Dosing | Renal dosing/appropriate prescribing | Favors 3/4; Neutral 1/4 | 1/4; mixed | 4/4; adoption/use, adherence, alert burden, overrides | governance/update; severity tiering; ownership |
| Active interruptive alerts | Other/unclear | 1 | Bhardwaja 201132 | Dosing | Medication errors; false positives | Favors 1/1 | 0/1; not evaluated | 1/1; adoption/use, adherence, burden/satisfaction | severity tiering; override reasons; governance detail |
| Passive dashboard/prompts | Prescribing/order entry | 1 | Abdel-Kader 201148 | Recognition | Referral/testing prompts | Mixed 1/1 | 0/1; not evaluated | 0/1; NR | implementation descriptors variably NR (see Table S1) |
| Standalone/mobile decision aid | Prescribing/order entry | 1 | Erler 201235 | Dosing | Renal dosing exceedance | Favors 1/1 | 0/1; not evaluated | 0/1; NR | alert modality; trigger logic; denominators |
| Standalone/mobile decision aid | Monitoring/follow-up | 1 | Locatelli 200945 | Monitoring | Hb/ferritin thresholds | Neutral 1/1 | 0/1; not evaluated | 1/1; adoption/use, alert burden, satisfaction | alert modality; trigger logic; rule transparency |
| Standalone/mobile decision aid | Other/unclear | 1 | Alhodaib 202037 | Monitoring | Decision accuracy score | Neutral 1/1 | 0/1; not evaluated | 1/1; adoption/use, burden/satisfaction | workflow efficiency NR; modality/triggering; integration detail |
| Unclear/NR | Prescribing/order entry | 3 | Field 200930; Lim 202536; Sequist 201847 | Dosing/Monitoring | Appropriate orders/targets | Favors 1/3; Mixed 2/3 | 1/3; no difference | 3/3; adoption/use, burden, overrides | modality classification; severity tiering; interface/trigger detail |
| Unclear/NR | Monitoring/follow-up | 3 | Chen 202239; Mancini 200738; Peralta 202044 | Monitoring/Recognition | Monitoring risks (e.g., hypotension/nutrition/CKD recognition) | Favors 3/3 | 3/3; mixed | 2/3; adoption/use, satisfaction | reporting gaps vary (see Table S1) |
| Unclear/NR | Other/unclear | 5 | Carroll 201842; Januzzi 202540; Patzer 201846; Peralta 202034; Samal 202243 | Recognition/Risk prediction | CKD recognition/awareness; risk-tool labs | Favors 1/5; Neutral 1/5; Mixed/Worse/NR remaining | 3/5; mixed/NR/no difference | 3/5; adoption/use, burden/satisfaction | modality; trigger logic; required inputs |
The most consistent patient-safety benefit is more appropriate renal dosing and prescribing. Studies that included real-time alerts or order-entry checks reported a higher proportion of doses adjusted to kidney function, decreased use of contraindicated drugs, and reduced excessive dosing of renally eliminated drugs.29–32,35,38,39 As these errors represent a common route to drug-induced AKI and other serious events, such process gains are clinically relevant even when adjudicated ADEs were not fully captured. The substantial heterogeneity in early dosing trials, however, suggests that impact depends heavily on local thresholds, targeted drug classes, degree of EHR integration, and provider response to prompts.29–32,35 Consistent with this, our configuration-based Evidence Map (Figure 4) indicates that the densest evidence for process improvement clusters in interruptive, order-entry implementations, whereas clinical outcomes are reported far less consistently across configurations.
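To make the idea of an order-entry renal-dose check concrete, the following is a minimal sketch of how a transparent rule of this kind could be expressed. It is not taken from any included trial. The Cockcroft-Gault equation is a standard published estimate of creatinine clearance, but the review does not state which kidney-function estimate each trial used. The rule table, the placeholder drug names (`drug_a`, `drug_b`), the thresholds, and the severity labels are hypothetical and are not clinical guidance.

```python
from dataclasses import dataclass
from typing import Optional


def cockcroft_gault(age_years: float, weight_kg: float,
                    serum_creatinine_mg_dl: float, female: bool) -> float:
    """Estimated creatinine clearance (mL/min) by the Cockcroft-Gault equation."""
    crcl = (140 - age_years) * weight_kg / (72 * serum_creatinine_mg_dl)
    return crcl * 0.85 if female else crcl


@dataclass(frozen=True)
class RenalRule:
    drug: str
    crcl_below: float         # rule applies when CrCl is below this value (mL/min)
    max_daily_dose_mg: float  # 0.0 is used here to mean "avoid entirely"
    severity: str             # hypothetical tier: "high" or "low"


# HYPOTHETICAL rules for placeholder drugs; thresholds are illustrative only.
RULES = [
    RenalRule("drug_a", crcl_below=30.0, max_daily_dose_mg=0.0, severity="high"),
    RenalRule("drug_b", crcl_below=50.0, max_daily_dose_mg=500.0, severity="low"),
]


def check_order(drug: str, daily_dose_mg: float, crcl: float) -> Optional[str]:
    """Return an alert message if the order exceeds a matching rule, else None."""
    for rule in RULES:
        if (rule.drug == drug
                and crcl < rule.crcl_below
                and daily_dose_mg > rule.max_daily_dose_mg):
            return (f"{rule.severity}: {drug} {daily_dose_mg} mg/day exceeds "
                    f"{rule.max_daily_dose_mg} mg/day at CrCl {crcl:.0f} mL/min")
    return None


if __name__ == "__main__":
    crcl = cockcroft_gault(age_years=78, weight_kg=60, serum_creatinine_mg_dl=1.8, female=True)
    print(check_order("drug_a", daily_dose_mg=100.0, crcl=crcl))
```

Keeping the threshold, dose limit, and severity tier as explicit data rather than embedded logic is one way to make the trigger logic auditable, which is among the reporting gaps listed in Table 7.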
By contrast, effects on renal function, cardiovascular events, and survival were inconsistent. A small number of trials combined a CDSS with other interventions, such as practice facilitation or intensive management, and found favourable eGFR trajectories, improved renal cortical blood flow, or less severe intradialytic hypotension, but no improvement in hemoglobin control, CKD quality measures, or survival.33,36,38–40,42–44,48 This discrepancy between relatively clear benefits in proximal safety process measures and neutral downstream outcomes is not surprising: trials were powered on process endpoints; follow-up periods were short; the CKD populations studied were heterogeneous, with co-morbidities and competing risks; and other quality initiatives often ran concurrently. Even significant reductions in dosing-error rates may translate into relatively minor absolute changes in hard outcomes.
Figure 5 further confirms the fragility of the evidence base. By visualizing the network at the trial level, we observe that clinical-stage testing is restricted to a substantially smaller number of configurations than the raw trial counts might imply. This raises a substantial risk: any inference about patient-level outcomes is drawn from a literature dominated by process-level data, so we make such inferences with extreme caution.
CDSS also modestly increased EHR-documented CKD recognition in trials that focused on problem-list documentation and risk stratification; appropriate labelling is a prerequisite for activating renal-dosing rules, monitoring protocols, or nephrology referral prompts.33,34,42,44,48 The majority of studies evaluated transparent rule systems that encode dosing thresholds or contraindications, whereas AI-enabled tools focused on higher-level predictive tasks, such as optimizing erythropoiesis-stimulating agent dosing or estimating cardio-renal risk, rather than broader formulary-wide dose checking.29–32,35,36,38–40,42,48 Randomised trial evidence suggests that AI systems are no less effective than standard practice on process-level and physiological surrogate outcomes. To date, however, we have not seen any full publication of an RCT showing a clear benefit of an AI system over a well-designed rule-based system in patient-centered safety. This is a crucial distinction. With this in mind, Table 7 reflects “evidence signals” that are configuration-specific, as well as ongoing reporting gaps that continue to prevent us from ascertaining whether a system’s success is due to its algorithm or to better workflow integration.
Implementation and usability results provide insight into why the safety potential of CDSS is incompletely realized. Provider adherence to CDSS advice ranged from very low to moderate (17% to 74%). Two main barriers (alert fatigue and time pressure) and seven other recurring patterns in the qualitative findings, including lack of training, weak technical integration, limited use of or coordination with pharmacists, and limited involvement in nephrologist consultations, were identified.29,31–33,36,37,43–46 High rates of non-specific alarms can desensitize clinicians and add to alert fatigue, creating an automation-related hazard when critical warnings are missed. Unclear responsibility for responding to high-risk alerts and weak embedding within multidisciplinary care can further dampen effects on clinical outcomes.32,36,45,47 This study characterises CDSS not merely as software but as a socio-technical intervention, in which safety is as much about governance and workflow as it is about algorithmic accuracy. This perspective is also why we did not treat delivery modes and workflow timing as fixed “background” variables and instead treated them as active components of the intervention. To put these findings into context, the Extended data (Appendix 2: Trial-level mapping table) offers an in-depth, trial-by-trial examination of the included trials. By examining how this evidence was generated, we seek to distil what the current evidence base can actually support, an exercise we see as important given recent technological developments that could render such data obsolete.
For health systems and quality-improvement leaders, none of this proves that renal-dose CDSS improves survival. Current evidence supports treating CDSS as safety-critical infrastructure whose primary value may be preventing avoidable prescribing errors, such as overdose of renally cleared agents, failure to reduce doses as eGFR falls, missed initiation of ACEi/ARB therapy where indicated, or delayed referral to nephrology.29–32,35,38,39,45,47 Deployment should be accompanied by broader medication-safety initiatives, including pharmacist-led review of high-risk prescriptions, standardized CKD order sets, and education on drugs and the kidney. A few simple metrics (the proportion of renally inappropriate orders, patterns of overrides for high-risk alerts, and serious medication-related events) can be used to recalibrate thresholds, eliminate low-value rules, and pinpoint services that need support. Alert design, in turn, should focus on high-risk scenarios, with straightforward recommendations built into the ordering workflow.29,31–33,36,37,39,40,42–46,48 Our configuration-based synthesis further suggests that implementation decisions (e.g., interruptive alerts at order entry vs non-interruptive dashboards) should be considered part of the intervention itself, because they condition adoption and the plausibility of achieving clinically meaningful downstream impact.
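As an illustration of the simple monitoring metrics described above, the following minimal sketch computes the proportion of renally inappropriate orders and the override rate per alert severity tier from a log of order events. The `OrderEvent` structure, its field names, and the severity labels are assumptions made for this example; they do not describe the data model of any included trial or EHR.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class OrderEvent:
    renally_inappropriate: bool    # order judged inappropriate for kidney function
    alert_severity: Optional[str]  # hypothetical tier, e.g. "high" or "low"; None if no alert fired
    overridden: bool = False       # clinician dismissed the alert without changing the order


def inappropriate_proportion(events: List[OrderEvent]) -> float:
    """Share of all orders that were renally inappropriate (0.0 for an empty log)."""
    if not events:
        return 0.0
    return sum(e.renally_inappropriate for e in events) / len(events)


def override_rate_by_severity(events: List[OrderEvent]) -> Dict[str, float]:
    """Overrides divided by alerts fired, reported separately for each severity tier."""
    fired: Dict[str, int] = {}
    overridden: Dict[str, int] = {}
    for e in events:
        if e.alert_severity is None:
            continue  # no alert fired, so nothing could be overridden
        fired[e.alert_severity] = fired.get(e.alert_severity, 0) + 1
        overridden[e.alert_severity] = overridden.get(e.alert_severity, 0) + int(e.overridden)
    return {sev: overridden[sev] / fired[sev] for sev in fired}


if __name__ == "__main__":
    # Hypothetical log entries, for illustration only.
    log = [
        OrderEvent(True, "high", overridden=True),
        OrderEvent(True, "high", overridden=False),
        OrderEvent(False, None),
        OrderEvent(False, "low", overridden=True),
    ]
    print(inappropriate_proportion(log))
    print(override_rate_by_severity(log))
```

Reporting override rates per tier, with the number of alerts fired as the denominator, keeps a high override rate for low-value alerts from masking overrides of high-risk ones.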
Three guiding considerations arise from the existing evidence.29–32,35,36,38,40,42,48 First, pragmatic, sufficiently powered, head-to-head RCTs comparing AI-CDSS with mature rule-based systems are required to establish whether advanced prediction and personalization provide incremental safety or cost-effectiveness gains beyond those achieved with effectively implemented rules. Second, a CKD-CDSS core outcome set that includes renal-dose accuracy, EHR-documented CKD recognition, nephrology referral, blood pressure control, and important patient safety outcomes such as acute kidney injury (AKI), serious ADRs, or MHRs would improve comparability of studies. Third, new trials should adhere to frameworks such as CONSORT-AI, treat implementation parameters as primary outcomes, and use mediation analyses to link prescribing-process gains to clinical endpoints, with attention to transparency, bias (e.g., differences between trained and non-trained sites), data drift, crossover, and equity in AI-CDSS models. To address the recurring evidence gaps observed (Table 7; Extended data, Appendix 2: Trial-level mapping table), future trials need more complete reporting of the intervention itself. This encompasses detailed trigger logic, the alert delivery mechanism, and severity tiers, as well as systematic reporting of override data. Ownership and governance of rule and model updates, which were recurrently unreported, should be described as well. These are the factors that determine whether a study’s data can contribute to interpretation or be included in subsequent meta-analyses.
This systematic review was reported according to the PRISMA guidelines, was prospectively registered in PROSPERO, and included only randomized controlled trials, thereby improving internal validity compared with mixed-design reviews. Applying outcome-specific RoB 2 evaluations and fitting random-effects models with fixed-effect sensitivity analyses showed that process outcomes differed from clinical endpoints and revealed an apparent temporal pattern: earlier dosing studies yielded larger, although more heterogeneous, effects, while more recent EHR-integrated trials gave rise to smaller but more consistent gains in CKD awareness and associated processes. Another strength is the structured nature of the synthesis and the visualization of the architecture of the evidence at the trial level (Figure 4, Figure 5, and Table 7). Beyond producing pooled effect estimates, these methods provide a nuanced tool for navigating a literature with very high heterogeneity. They guard against reporting pooled estimates without interpreting them in their specific context.
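For readers less familiar with the pooling approach, the following minimal sketch shows how a random-effects estimate and a fixed-effect sensitivity estimate can be computed for risk ratios. It assumes inverse-variance weighting on the log scale with the DerSimonian-Laird estimator of between-trial variance; the weighting options actually selected in RevMan for this review may differ, and the trial counts in the usage example are hypothetical, not data from the included RCTs.

```python
import math
from statistics import NormalDist
from typing import Dict, List, Tuple

# One trial = (events_intervention, n_intervention, events_control, n_control)
Trial = Tuple[int, int, int, int]


def log_rr(events_t: int, n_t: int, events_c: int, n_c: int) -> Tuple[float, float]:
    """Log risk ratio and its approximate variance for a two-arm trial."""
    estimate = math.log((events_t / n_t) / (events_c / n_c))
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return estimate, variance


def pool(trials: List[Trial], random_effects: bool = True) -> Dict[str, float]:
    """Inverse-variance pooled risk ratio with a normal-approximation 95% CI."""
    effects = [log_rr(*t) for t in trials]
    y = [est for est, _ in effects]
    v = [var for _, var in effects]
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

    tau2 = 0.0  # between-trial variance; stays 0 for the fixed-effect model
    if random_effects and len(trials) > 1:
        q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - (len(trials) - 1)) / c)             # DerSimonian-Laird

    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    z = NormalDist().inv_cdf(0.975)
    return {
        "rr": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se),
        "ci_high": math.exp(pooled + z * se),
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    example = [(30, 100, 20, 100), (60, 100, 20, 100), (15, 100, 20, 100)]
    print("random:", pool(example, random_effects=True))
    print("fixed: ", pool(example, random_effects=False))
```

When trials disagree, the between-trial variance is positive and the random-effects interval is wider than the fixed-effect one, which is why comparing the two models serves as a sensitivity analysis.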
Important limitations must also be recognized. There was significant heterogeneity across many of the most important outcomes, driven by differences in outcome definitions and trial design; only a proportion of eligible trials contributed to each meta-analysis. In particular settings, there were too few studies with effect estimates to allow informative subgroup analyses or exploration of publication bias. Several studies were rated as “some concerns” or “high” risk of bias in key domains, and the majority of studies took place in high-income countries with well-established EHR systems. Few trials explicitly quantified adjudicated adverse drug events and other patient-centred safety measures, so estimates of harm reduction continue to rest largely on the premise that improvements in prescribing practice translate into less harm.29–32,35,38 We acknowledge, however, that the most consistent signal is around improved renal dosing and related prescribing practices. Finally, because trial reporting of configuration and implementation details was frequently incomplete, the Evidence Map and network visualization necessarily reflect what was reported; Table S2 is therefore provided to transparently document reporting presence/absence and to guide more reproducible future evaluations.
Rule-based and AI-driven CDSS have improved proximal drug safety in CKD by increasing the accuracy of dosing and prescribing practices. However, it has been challenging to demonstrate the link between these systems and long-term clinical endpoints, such as renal function, CVD outcomes, or overall mortality. One conclusion emerges clearly from this heterogeneous literature: no inherent property determines whether a CDSS will be effective; instead, effectiveness is contingent on how the system interacts with clinical work processes. These findings point to a shift in paradigm, from CDSS as an add-on to CDSS as safety-critical infrastructure. Systems that are both safe and trustworthy require an enabling environment that includes regulation, a human-centered design approach, and long-term monitoring. The focus of research should thus shift to more systematic head-to-head comparisons of AI vs. traditional rule-based systems. For individual trials to contribute to future evidence syntheses, they need adequate statistical power and fully transparent reporting; these are fundamental hallmarks of good research and should not be traded off for the opportunity to take part in larger studies.
During the preparation of this paper, the authors used AI-assisted language tools (ChatGPT and Grammarly) to help improve English grammar and phrasing in this manuscript. All scientific content of the manuscript (study design, data extraction and analysis, interpretation, and conclusions) was developed by the authors. No AI tool produced any original scientific content. The authors thoroughly examined all AI-generated text and fully accept responsibility for the final manuscript content.
No data associated with this article.
Zenodo: PRISMA 2020 Checklist v2 and PRISMA flow diagram, Appendix for “Improving Medication Safety in Chronic Kidney Disease Using Rule-Based and Artificial Intelligence–Based Clinical Decision Support Systems: A Systematic Review of Randomized Controlled Trials”. Working DOI: https://doi.org/10.5281/zenodo.18678563. License: CC0 1.0.49
Data screening, extraction, and manuscript preparation were performed using Microsoft Office (Word/Excel) via an official student-licensed subscription provided by Universitas Indonesia. Study screening and selection were managed in Covidence (trial version). Meta-analysis was conducted using RevMan 5.2 (trial version; accessed September 2025). Network mapping and visualization were performed using Cytoscape (version 3.10.4). Figure 4 was created using Canva under an education/school license.
Literature searches were conducted using Scopus, ScienceDirect, and ProQuest accessed through Universitas Indonesia’s institutional subscriptions. PubMed was accessed free of charge, and full-text access for paywalled articles was obtained where available through Universitas Indonesia’s institutional licenses.
This systematic review is reported in accordance with the PRISMA 2020 statement. The PRISMA flow diagram is presented in Figure 1. The completed PRISMA 2020 checklist is available as Extended data (https://doi.org/10.5281/zenodo.18678563).49
License: CC0 1.0.