Systematic Review

Improving Medication Safety in Chronic Kidney Disease Using Rule-Based and Artificial Intelligence–Based Clinical Decision Support Systems: A Systematic Review of Randomized Controlled Trials

[version 1; peer review: awaiting peer review]
PUBLISHED 16 Apr 2026

This article is included in the Artificial Intelligence and Machine Learning gateway.

This article is included in the AI in Medicine and Healthcare collection.

Abstract

Background

Optimizing renal drug dosing to avoid toxicity is essential in chronic kidney disease (CKD), yet prescribing errors are common. Clinical decision support systems (CDSS), both rule-based and artificial intelligence/machine learning (AI/ML)-based, are used to address this safety gap; however, their impact remains uncertain.

Methods

We performed a PRISMA-guided systematic review and meta-analysis of randomized controlled trials (RCTs) comparing rule-based or AI/ML-based CDSS with usual care among adults with CKD or at risk of CKD-related prescribing errors. The primary outcome was a medication safety endpoint aligned with the CDSS logic (appropriate renal dosing, potentially inappropriate prescribing, and medication errors). Secondary outcomes were quality-of-care processes, clinical endpoints, use of health services, and patient-reported outcomes. To address heterogeneity, we supplemented the meta-analysis with a structured Best Evidence Synthesis and trial-level mapping by delivery mode and workflow stage.

Results

Among the 20 RCTs that met our inclusion criteria, 6 provided data for the meta-analysis. CDSS improved proximal medication-safety processes (RR 1.76; 95% CI, 1.13–2.74). The wide prediction interval (0.75–4.14) indicates that effectiveness depends on implementation and local settings. Documentation of CKD in electronic health records improved (RR 1.19; 95% CI, 1.07–1.32), but downstream clinical outcomes were less studied and remain equivocal. The interventions with available evidence were predominantly interruptive alerts at order entry. Implementation barriers were common; clinician compliance ranged from 17% to 74%, with alert fatigue, time constraints, and unclear understanding of system function and override processes cited as barriers.

Conclusions

CDSS for CKD have shown value in enhancing medication safety, but not all models have been successful. These process-based benefits are not yet supported by demonstrable improvements in clinical outcomes. This gap supports treating renal CDSS as part of safety-critical services, requiring auditable logic, clear severity tiers, and a traceable mechanism for overrides. A replicable evidence base requires agreed core outcomes and reporting standards.

Keywords

clinical decision support systems, chronic kidney disease, medication safety, renal dose adjustment, artificial intelligence, randomized controlled trials

1. Introduction

Chronic kidney disease (CKD) is a major global health problem associated with high morbidity, mortality, and healthcare resource utilization. Patients with CKD often present with multimorbidity and are commonly exposed to polypharmacy, which makes them vulnerable to medication-related harm. Prescribing based on renal function, periodic monitoring, and timely treatment changes are required to ensure optimal management, prevent adverse outcomes, and delay disease progression.1–6 In practice, however, dosing and monitoring decisions are often made under severe time pressure, and clinicians may struggle with increasingly complex electronic health records (EHRs) that obscure key laboratory, diagnostic, and medication information at the point of care.7,8 These limitations are exacerbated in nephrology practice, especially for patients who receive renally cleared medications, are exposed to nephrotoxic agents, or take antihypertensive treatment. In such high-risk settings, even small prescribing errors can cause acute kidney injury or accelerate disease progression. The result is a persistent gap between evidence-based recommendations and actual care, particularly for high-risk prescribing decisions, which makes medication safety in CKD a priority for quality improvement and patient safety efforts.

Clinical decision support systems (CDSS) have emerged as a key strategy to close this gap.9–13 Broadly, current CDSS follow two trajectories. Rule-based systems encode guideline logic and expert rules into deterministic alerts or order sets, typically triggered by thresholds in kidney function, prescribed drugs, or other structured data.14,15 More recently, artificial intelligence–enabled CDSS (AI-CDSS) have been developed that use longitudinal data and machine-learning models to generate individualized risk predictions or treatment recommendations.16–18 Both approaches aim to standardize care processes, support safer prescribing, and reduce preventable harm without unduly increasing clinicians’ cognitive workload. From a translational perspective, renal CDSS operationalize evidence into real-time decisions at the point of care. In practice, however, CDSS are deployed on a variety of platforms (e.g., EHR/CPOE-integrated, web-based, mobile, or standalone tools) and embedded at different workflow stages (e.g., prescribing vs monitoring), which may have important implications for uptake, usability, and effectiveness. Examining how trials map onto implementation configurations and outcome categories therefore helps show where the evidence base is well developed and where downstream evaluation remains weak.

Although CDSS are increasingly used, their overall effect on the quality and safety of care in CKD remains unclear. Prior reviews of CDSS have often pooled heterogeneous conditions and interventions, making it difficult to determine where CDSS provide the greatest value for CKD specifically.19–22 Moreover, most evaluations rely on proximal process measures (e.g., renal dosing, CKD documentation, or guideline adherence), which do not correlate directly with downstream patient-safety outcomes such as adverse drug events, hospitalisation, and progression to end-stage kidney disease.

Implementation practices and their relation to safety are inconsistently reported and inadequately aggregated.23–26 The primary obstacle to CDSS implementation in nephrology is no longer data capture but understanding which system designs affect real-world clinical outcomes (i.e., both kidney disease progression and dialysis-related complications). Answering this question is important if decision support tools are to be more than documentation aids. These evidence gaps indicate that meta-analytic summary estimates alone are inadequate for informing clinical policy or directing future research in renal drug safety. What is needed instead is an overarching framework that transparently relates CDSS functionality and workflow integration to actual care processes. The contrast between consistent improvements in documentation and widely varying effects on dosing errors points to strong context dependence. If this context dependence is disregarded, there is a danger of relying on statistical abstractions that hide the actual determinants of success.

2. Methods

2.1. Study registration

This review followed the PRISMA guidelines (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) and the Cochrane Handbook for Systematic Reviews of Interventions.27,28 It was registered in PROSPERO (Registration No. CRD420251139903).

2.2. Search strategy and selection criteria

Systematic searches of PubMed, Scopus, ScienceDirect, and ProQuest were conducted through August 2025. Searches combined Medical Subject Headings (MeSH) with free-text terms, without language restrictions. The detailed search strategy is available as Extended data (Appendix 1: Search strategy).

Eligible studies were restricted to RCTs examining digital CDSS applied to clinical decision-making in CKD, including EHR/CPOE-integrated solutions and web-based or mobile (standalone) electronic decision supports. Interventions could be AI/ML-based or rule-based. Trials had to report at least one process domain concerning medication safety. The primary outcome was appropriate dosing according to renal function. Secondary outcomes included CKD identification/documentation, mortality, renal function (as measured by serum creatinine/eGFR), BP/CV events, usability of the CDSS, and, when reported, implementation outcomes such as adherence to or compliance with CDSS recommendations. Non-randomized studies, non-electronic decision aids, and studies not specific to CKD were excluded. Details are available in Tables 1 and 2.

Table 1. Eligibility criteria (PICOS-aligned).

Domain | Inclusion | Key exclusions
Population | Chronic Kidney Disease (CKD) cohorts in routine care (adult CKD stages as applicable). | Pediatric-only studies (unless adults predominated).
Intervention | Digital Clinical Decision Support Systems (CDSS) used for clinical decision-making (including EHR/CPOE-integrated, web-based, mobile, or standalone tools): AI/ML-based or rule-based. Must target ≥1 of: treatment advice; renal function–guided dosing; CKD identification/documentation; laboratory monitoring; nephrology referral; other management supports. | Non-digital/paper-only decision aids.
Comparator | Usual/standard care (paper/EHR without CDSS) or alternative non-AI/rule configurations, as reported. | -
Outcomes (eligibility scope) | At least one relevant process or clinical endpoint within the scope above. | Protocols with no clinical outcomes; editorials/commentaries.
Study design | Randomized Controlled Trials (individual, cluster, or crossover). | Non-randomized/observational designs.

Table 2. Outcomes and operationalization.

Outcome category | Operational definition (analysis unit) | Effect metric | Notes/handling
Primary: Appropriate renal dosing | Proportion of medication instances or patient–drug dispensings concordant with renal-function guidelines. | Risk Ratio (RR) | When only “inappropriate/excess dosing” or “medication error” was reported, appropriate counts were arithmetically derived without changing denominators.
CKD recognition (EHR) | Presence of CKD diagnosis/problem list consistent with clinical criteria (patient-level). | RR | Structured EHR lists/codes.
Mortality | All-cause or specified mortality (patient-level). | RR or HR (time-to-event) | Prefer HR via generic inverse variance when available; do not pool HR with RR.
Renal function | eGFR level/slope, creatinine (patient-level). | MD/SMD | Harmonize units; prespecify directionality.
Blood pressure/CV events | SBP/DBP; cardiovascular events (patient-level). | MD/SMD; RR | As reported.
Usability/acceptability | SUS scores, adoption/override, time-to-action (latency). | MD | Report implementation metrics narratively if heterogeneous.
Guideline adherence | ACEi/ARB prescribing; nephrology referral; ESA adherence (patient-level). | RR | Define denominator clearly (eligible patients/orders).

2.3. Data extraction

Records were imported into Covidence and deduplicated; titles and abstracts were then screened, and full texts were reviewed for eligibility. A standard Excel form was used to record study features (title/authors/year, setting/country, RCT design/participants), CDSS features (functionality/classifications/intervention(s)), comparator(s), outcomes (primary/secondary), data types, and arm-specific sample sizes/follow-up. Screening and data extraction were performed by two independent reviewers, with a third reviewer to resolve any disagreements.

2.4. Risk of bias in included studies

Risk of bias across studies was assessed with Cochrane’s RoB 2 for the following five domains: randomization process, deviations from intended interventions, missing outcome data, measurement of outcome, and selection of reported results. Risk of bias (RoB) judgements were performed using the RoB 2 guidance: if all domains were at low risk, then the overall RoB was classified as low; if one or more domains were judged to be high, the overall judgement was high; otherwise, some concerns. Methodological quality was assessed by two reviewers working independently of one another, and disagreements were resolved by discussion or a third reviewer.
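The overall-judgement rule described above can be sketched in a few lines. This is a minimal illustration of the rule exactly as stated in this section, not the full RoB 2 algorithm; the function name `overall_rob` and the example judgements are hypothetical and are not data from the review.

```python
from typing import Iterable

LOW, SOME, HIGH = "low", "some concerns", "high"

def overall_rob(domain_judgements: Iterable[str]) -> str:
    """Combine per-domain judgements using the rule stated in Section 2.4."""
    judgements = [j.strip().lower() for j in domain_judgements]
    if any(j == HIGH for j in judgements):
        return HIGH   # one or more high-risk domains -> overall high
    if all(j == LOW for j in judgements):
        return LOW    # every domain low -> overall low
    return SOME       # anything else -> some concerns

# Hypothetical trial, for illustration only
example = {
    "randomization process": "low",
    "deviations from intended interventions": "some concerns",
    "missing outcome data": "low",
    "measurement of the outcome": "low",
    "selection of the reported result": "low",
}
print(overall_rob(example.values()))
```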

2.5. Reporting bias assessment

We did not formally assess publication bias (e.g., funnel plot–based methods) because the number of studies contributing to each synthesis was limited and the included trials were highly heterogeneous in interventions and outcomes.

2.6. Synthesis methods

Meta-analyses were performed using RevMan 5.2 (accessed September 2025) with fixed- and random-effects models. Dichotomous outcomes were pooled as risk ratios using the Mantel–Haenszel method; continuous outcomes were expressed as mean differences (MD) or standardized mean differences (SMD). We used fixed-effect models when heterogeneity was low (I2 ≤ 50% and conceptually homogeneous outcomes) and random-effects models when heterogeneity was high (I2 > 50%) or when design or setting varied importantly. Uncertainty was expressed with 95% CIs, and heterogeneity with I2 and Chi2. A 0.5 continuity correction was applied to zero cells when necessary, and multi-arm studies were combined as per the Cochrane Handbook to prevent double-counting. Because heterogeneity in the main effects was substantial, we also report pooled estimates with prediction intervals (PI) to convey the range of effects expected across clinical settings. Because pooled estimates convey little about how implementation varies from site to site, we supplemented the meta-analysis with a structured Best Evidence Synthesis (BES) and trial-level mapping by delivery mode and workflow stage, incorporating TIDieR elements; trial network mapping was performed using Cytoscape (version 3.10.4).
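Three of the data-handling rules in this section (deriving appropriate counts from reported inappropriate counts, the 0.5 continuity correction, and the I2-based model choice) can be sketched as small helpers. The function names and the example numbers are hypothetical; the sketch assumes the common convention of adding the correction to every cell of a 2×2 table that contains a zero, which the text does not spell out.

```python
def appropriate_from_inappropriate(inappropriate: int, total: int) -> int:
    """Derive the appropriate count without changing the denominator."""
    if not 0 <= inappropriate <= total:
        raise ValueError("inappropriate count must lie between 0 and total")
    return total - inappropriate

def continuity_correct(a, b, c, d, k=0.5):
    """Add k to all four cells of a 2x2 table if any cell is zero (assumed convention)."""
    if 0 in (a, b, c, d):
        return tuple(x + k for x in (a, b, c, d))
    return (a, b, c, d)

def choose_model(i2_percent: float, important_variation: bool = False) -> str:
    """Fixed effect for I2 <= 50% with homogeneous outcomes, otherwise random effects."""
    if i2_percent > 50 or important_variation:
        return "random"
    return "fixed"

# Hypothetical inputs, for illustration only
print(appropriate_from_inappropriate(26, 100))
print(continuity_correct(0, 10, 3, 7))
print(choose_model(97), choose_model(0))
```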

2.7. Certainty of evidence

We did not apply GRADE because outcomes and effect measures were highly heterogeneous and many trials did not report estimable effect sizes suitable for consistent certainty rating across outcomes.

3. Results

3.1. Study selection

The search identified 1,288 articles. After removal of 173 duplicates, 396 articles were screened by title and abstract, of which 284 were excluded for not meeting the inclusion criteria. We sought retrieval of 112 reports; 75 were not retrieved, leaving 37 reports assessed for eligibility. In total, 20 trials met our inclusion criteria, and 6 of them were included in the quantitative analysis. Figure 1 depicts the process of literature retrieval.
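The flow counts can be checked with simple arithmetic. The sketch below uses only the numbers quoted in this paragraph; the variable names are illustrative. Two values are derived rather than reported: the number of full-text reports excluded, and the difference between records remaining after deduplication and records screened, which the text does not itemize.

```python
identified = 1288
duplicates_removed = 173
screened = 396
excluded_at_screening = 284
not_retrieved = 75
included = 20
in_meta_analysis = 6

sought_for_retrieval = screened - excluded_at_screening   # reported as 112
assessed_full_text = sought_for_retrieval - not_retrieved  # reported as 37

# Derived values that are not stated in the text
excluded_full_text = assessed_full_text - included
not_itemized_before_screening = identified - duplicates_removed - screened

print(sought_for_retrieval, assessed_full_text)
print(excluded_full_text, not_itemized_before_screening)
```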


Figure 1. PRISMA flow diagram of study identification, screening, eligibility assessment, and inclusion for randomized trials of CDSS in chronic kidney disease.

3.2. Study characteristics

The main characteristics of the included studies are summarized in Table 3. The CDSS interventions were heterogeneous in their technical platform (e.g., from integrated EHR/CPOE systems to standalone mobile applications) and core functions (e.g., renal dosing support or CKD documentation). Because this diversity was a major source of heterogeneity, we addressed it up front with a structured Best Evidence Synthesis (BES).

Table 3. Characteristics of the included studies.

Source (First author, Publication year) | Year | Country | Trial design | Number of participants | Specific technology/platform used
Abdel-Kader, 2012 (ref. 48) | 2011 | United States | Cluster randomized controlled trial | 248 | EpicCare (Epic Systems)
Alhodaib, 2020 (ref. 37) | 2020 | United Kingdom | Randomized controlled trial (RCT) | 54 (junior doctors and specialist nurses) | Mobile app developed by Medic Genie
Awdishu, 2016 (ref. 29) | 2016 | United States | Cluster randomized controlled trial | 1278 | Epic EHR
Bhardwaja, 2011 (ref. 32) | 2011 | United States | Randomized controlled trial (RCT) | 32917 | Drug Renal Alert Pharmacy (DRAP) program
Carroll, 2018 (ref. 42) | 2018 | United States | Cluster randomized controlled trial | 6699 | Three separate systems were used due to vendor issues; integrated with electronic health records (EHRs).
Chen, 2022 (ref. 39) | 2022 | China | Randomized controlled trial (RCT) | 120 | Internet + H2H model
Erler, 2012 (ref. 35) | 2012 | Germany | Cluster randomized controlled trial | 404 | DOSING software (standalone CD version)
Field, 2009 (ref. 30) | 2009 | Canada | Cluster randomized controlled trial | 800 (residents) | Meditech MAGIC platform using Provider Order Management (POM 4.9 upgraded to 5.5)
Januzzi, 2025 (ref. 40) | 2025 | 34 countries, with significant European representation | Randomized controlled trial (RCT) | 4401 | Machine learning-based risk prediction algorithm for diabetic kidney disease (DKD).
Lim, 2025 (ref. 36) | 2025 | Taiwan | Randomized controlled trial (RCT) | 124 | Four machine learning models: bagged regression trees with random effects (REEM trees), mixed-effect random forest (MERF), and long short-term memory (LSTM) networks LSTM-I and LSTM-II.
Locatelli, 2009 (ref. 45) | 2009 | Bulgaria, Croatia, Germany, Italy, Latvia, Poland, Romania, Serbia and Montenegro | Cluster randomized controlled trial | 599 | Interactive centrally controlled database with secure Internet access case report form and a CDS system generating guideline-based management prompts.
Mancini, 2007 (ref. 38) | 2007 | Italy | Crossover RCT | 55 | Dialog Advanced dialysis machine by Braun, integrated with bioLogic RR system.
Patzer, 2018 (ref. 46) | 2018 | United States | Randomized controlled trial (RCT) | 443 | Mobile and web-based application
Peralta, 2020 (ref. 44) | 2020 | United States | Cluster randomized controlled trial | 1819 | Electronic Health Record (EHR)
Peralta, 2020 (ref. 34) | 2020 | United States | Cluster randomized controlled trial | 524 | EpicCare electronic health record (EHR) system
Samal, 2022 (ref. 43) | 2022 | United States | Randomized controlled trial (RCT) | 5590 | Web-based application with a single-page interface, web server, database, and application programming interface.
Samal, 2024 (ref. 41) | 2024 | United States | Cluster randomized controlled trial | 2026 | Epic Systems software
Sequist, 2018 (ref. 47) | 2018 | United States | Cluster randomized controlled trial | 7691 | Epic Systems EHR
Sperl-Hillen, 2023 (ref. 33) | 2023 | United States | Cluster randomized controlled trial | 6420 | EpicCare electronic health record (EHR) system
Terrell, 2010 (ref. 31) | 2010 | United States | Randomized controlled trial (RCT) | 2783 | Computerized decision support system integrated into a computerized physician order entry (CPOE) system.

To look beyond pooled estimates, we used a layered model for evidence synthesis. This comprised an Evidence Map (Figure 4) giving an overview of coverage across outcome domains and a Best-Evidence Table (Table 7) identifying robust patterns and areas of poor reporting within each configuration. Finally, a Network Map (Figure 5) illustrates the ‘architecture’ of this evidence by showing how individual trials and implementation configurations connect to outcome domains.

3.3. Risk of bias in included studies

For the six studies included in the meta-analysis, the risk-of-bias assessment is shown in Figure 2. For the first outcome, the four RCTs were generally at high risk of bias or raised some concerns in several domains (e.g., deviations from intended interventions, missing outcome data, selection of the reported result), whereas measurement of the outcome was largely at low risk.29–32 This is in line with the very high heterogeneity observed (I2 = 97%; pooled RR 1.76, 95% CI 1.13–2.74). For the second outcome, Peralta (2020) was at low risk across all domains, whereas Sperl-Hillen (2023) was judged at high risk in several domains.33,34 Effects on EHR-documented CKD recognition were consistent (I2 = 0%) and precisely estimated (fixed-effect: RR 1.20 [1.08–1.33]; random-effects: RR 1.19 [1.07–1.32]).


Figure 2. RoB 2 risk-of-bias assessment for the randomized controlled trials included in this review.

Each row represents a study and columns A–F represent domains (A, randomization process; B, deviations from intended interventions; C, missing outcome data; D, measurement of the outcome; E, selection of the reported result; F, overall bias), with blue = low risk (√), orange = some concerns (!), and vermilion = high risk (X).

3.4. Primary outcome

3.4.1. Appropriate renal dosing

CDSS significantly improved appropriate renal dosing compared with standard care, with a risk ratio (RR) of 1.76 (95% CI 1.13–2.74; Z = 2.51; p = 0.01). However, heterogeneity among the included studies was considerable (I2 = 97%, τ2 = 0.19, χ2 = 72.66, df = 3, p < 0.00001). A random-effects model was therefore used to pool results across the four RCTs.29–32

Effects varied widely, with study-specific RRs ranging from 1.20 to 3.00. This wide range illustrates how strongly success depends on the clinical context and implementation approach. In absolute numbers, appropriate dosing was achieved in 2,926 of 5,565 medication instances (52.6%) in the CDSS arms, compared with 2,230 of 6,588 (33.8%) in the control groups (Figure 3).
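As a quick arithmetic check, the crude proportions behind these absolute numbers can be computed directly. The sketch uses only the counts quoted above; the variable names are illustrative. The crude ratio is shown for orientation only: it ignores the trial structure, which is why it differs from the pooled random-effects RR of 1.76.

```python
cdss_appropriate, cdss_total = 2926, 5565
ctrl_appropriate, ctrl_total = 2230, 6588

p_cdss = cdss_appropriate / cdss_total
p_ctrl = ctrl_appropriate / ctrl_total
crude_rr = p_cdss / p_ctrl  # unweighted, not a meta-analytic estimate

print(f"CDSS: {100 * p_cdss:.1f}%  control: {100 * p_ctrl:.1f}%  crude RR: {crude_rr:.2f}")
```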


Figure 3. Forest plots of the effect of CDSS versus usual care on appropriate renal-dosing actions per medication instance (A) and on EHR-documented CKD recognition (B); squares represent study-specific risk ratios with 95% confidence intervals weighted by inverse variance, diamonds indicate the pooled random-effects estimates, and the vertical line at RR = 1.0 denotes no difference between groups.

A. Appropriate renal-dosing action (per medication instance).

B. EHR-documented CKD recognition.

3.4.2. Prescribing and medication safety

Most studies favored rule-based CDSS. Of seven studies assessing medication dosing or prescribing, five favored CDSS, reporting reductions in overdose (19.2% vs 34.5%, OR 0.45; p < 0.001),35 reductions in excessive dosing (43% vs 74%, p = 0.001),31 fewer medication errors (33% vs 49%, p < 0.001),32 effects on several subcomponents of appropriate dosing (RR: dose 0.95; frequency 2.4; avoidance 2.6; information 1.8),30 and higher rates of appropriate renal dosing (17% vs 5.7%, OR 1.89; 95% CI 1.45–2.47; p < 0.001).29 One study found no significant difference in prescription failure (1.26% vs 0.5%, p = 0.11),36 and another was neutral regarding workflow/guideline outcomes.37 Beyond dosing and prescribing, one hemodialysis study reported reduced severe hypotension (8.3% vs 13.8%; p = 0.01),38 and one study of nutritional/biochemical changes showed improvements in laboratory parameters (albumin, prealbumin, hemoglobin, BUN), anthropometrics, quality of life, and satisfaction.39 One risk prediction study did not involve clinical implementation,40 and another provided no outcome data or direction of effect. Overall, evidence in the domain of medication safety and appropriateness supports CDSS; however, effect sizes and outcome types differ across studies.

3.4.3. Best Evidence Synthesis (BES) by implementation configuration

The magnitude of effect varied, with study-specific RRs ranging from 1.20 to 3.00. Although the raw counts point towards a benefit in the CDSS arms (2,926 of 5,565 doses appropriate vs 2,230 of 6,588 in controls), heterogeneity was marked (I2 = 97%), so any single average effect is of limited meaning. This motivated a Best Evidence Synthesis to investigate why some settings perform better than others.

Our mapping of trials by delivery method and workflow timing reveals a lopsided evidence base. Data on how interruptive alerts affect prescribing errors are plentiful, whereas evidence on long-term clinical significance and implementation hurdles remains thin and varies widely between settings (see Figure 4). This gap between process-level success and clinical impact is critical. The specific signals and reporting deficiencies for each configuration are summarized in Table 7. Finally, Figure 5 gives a network-level perspective on the ‘architecture’ of the evidence base, showing which trials support each configuration and where the evidence remains fragile. A more detailed trial-by-trial mapping can be found in the Extended data (Appendix 2).


Figure 4. CDSS implementation settings (Best evidence synthesis; Evidence map of outcome reporting).

This map illustrates how reporting is distributed across three dimensions: (A) process, (B) clinical, and (C) implementation outcomes. Trials are grouped by mode of delivery (vertical axis; e.g., active, interruptive alerts vs. passive dashboards) and timing in the workflow (horizontal axis; e.g., prescribing vs. monitoring). Values within each tile indicate the percentage of trials in that category reporting a particular outcome. The density of evidence, on a scale of 0–5, is shown as a color scale, with darker colors indicating greater accumulated evidence. The map shows that process outcomes are well reported, whereas downstream clinical and implementation outcomes are inconsistently collected. This discrepancy highlights the need to standardise core outcome and KPI reporting in future CDSS trials.


Figure 5. Evidence architecture of CKD CDSS trials: configuration hubs and outcome-domain connections.

Network showing how each randomized and quasi-randomized study is linked to its implementation configuration (delivery mode × workflow timing) and to the outcome domains it reported. Trial nodes (blue) are each connected to a single configuration node (red), representing the intervention’s delivery mode and timing, and to domain nodes (orange), reflecting whether the trial reported process, clinical, or implementation outcomes. The network shows that the weight of evidence is concentrated in relatively few configuration hubs and that links to clinical outcome nodes rely on a subset of trials, suggesting that the evidence on clinical impact may be skewed and fragile.

3.5. Secondary outcomes

3.5.1. EHR-documented CKD recognition

CDSS significantly improved EHR-documented CKD recognition compared with standard care (risk ratio [RR] = 1.19, 95% confidence interval [CI] = 1.07 to 1.32; Z = 3.30, p = 0.001). There was no heterogeneity (I2 = 0%; τ2 = 0.00; χ2 = 0.53; df = 1; p = 0.47). Data from two RCTs were pooled using an inverse-variance random-effects model with restricted maximum likelihood (REML) estimation. The study-specific estimates were RR = 1.11 (0.90–1.38; weight 24.5%; 86/165 vs 88/188) for Peralta (2020)34 and RR = 1.22 (1.08–1.38; weight 75.5%; 417/1568 vs 389/1783) for Sperl-Hillen (2023).33 In total, CKD was documented in 503 of 1733 patients in the CDSS arms vs 477 of 1971 in the control arms. Pooled effects under fixed- and random-effects models are summarized in Table 4.
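The pooled estimate for this outcome can be reproduced from the per-arm counts quoted above with a short inverse-variance calculation on the log risk ratio scale. This is a sketch under stated assumptions rather than the review's RevMan workflow: it uses the DerSimonian–Laird moment estimate of τ2 in place of REML to stay dependency-free, and the variable names are illustrative. Because Q is smaller than its degrees of freedom, τ2 is truncated at zero and the random-effects result coincides with the inverse-variance fixed-effect result.

```python
import math

# (CDSS events, CDSS n, control events, control n), as quoted in the text
trials = {
    "Peralta 2020": (86, 165, 88, 188),
    "Sperl-Hillen 2023": (417, 1568, 389, 1783),
}

def log_rr(a, n1, c, n0):
    """Log risk ratio and its large-sample variance for one trial."""
    est = math.log((a / n1) / (c / n0))
    var = 1 / a - 1 / n1 + 1 / c - 1 / n0
    return est, var

effects = {name: log_rr(*counts) for name, counts in trials.items()}
weights = {name: 1 / var for name, (_, var) in effects.items()}
w_sum = sum(weights.values())

pooled = sum(weights[n] * effects[n][0] for n in effects) / w_sum
q = sum(weights[n] * (effects[n][0] - pooled) ** 2 for n in effects)
df = len(effects) - 1

# DerSimonian-Laird moment estimate, truncated at zero (assumption: not REML)
c_dl = w_sum - sum(w ** 2 for w in weights.values()) / w_sum
tau2 = max(0.0, (q - df) / c_dl)

se = math.sqrt(1 / w_sum)  # also the random-effects SE here, because tau2 is 0
rr = math.exp(pooled)
ci_low = math.exp(pooled - 1.96 * se)
ci_high = math.exp(pooled + 1.96 * se)
weight_pct = {n: 100 * w / w_sum for n, w in weights.items()}

print(f"RR {rr:.2f} ({ci_low:.2f} to {ci_high:.2f}), Q = {q:.2f}, tau2 = {tau2:.2f}")
```

With these inputs the sketch returns RR 1.19 (1.07 to 1.32) with Q = 0.53 and weights of about 24.5% and 75.5%, in line with the figures reported in this section.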

Table 4. Summary of meta-analytic results (fixed-effect vs. random-effects).

Outcome | Method | Effect measure | Pooled effect (95% CI) | Heterogeneity (Q; df; p; I2) | τ2 (Estimator) | 95% Prediction interval | Notes
Appropriate renal dosing | M-H, Fixed | RR | 1.40 (1.35–1.45) | 76.35; 3; <0.00001; 96% | - | - | FE (fixed-effect) is shown as a comparator; assumes a single “true effect.”
Appropriate renal dosing | IV, Random | RR | 1.76 (1.13–2.74) | 76.35; 3; <0.00001; 97% | 0.19 (REML) | 0.75–4.14 | Primary model; weights are more balanced across studies; confidence intervals are wider due to high heterogeneity.
EHR-documented CKD recognition | M-H, Fixed | RR | 1.19 (1.07–1.32) | I2 = 0% | - | - | Zero heterogeneity; FE ≈ RE (random-effects).
EHR-documented CKD recognition | IV, Random | RR | 1.19 (1.07–1.32) | I2 = 0% | ≈0.00 (REML) | [≈ same] | Identical to FE when I2 = 0%.
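The prediction interval for appropriate renal dosing can be related to the pooled RR and τ2 in the table. The text does not state how the interval was calculated; the sketch below assumes the simple normal approximation exp(ln RR ± 1.96·τ), which happens to match the tabulated values. The function name is hypothetical. A formal prediction interval would additionally include the standard error of the pooled estimate and a t quantile, which gives a wider interval when only four trials are pooled.

```python
import math

pooled_rr = 1.76  # random-effects estimate from Table 4
tau2 = 0.19       # between-study variance from Table 4

def approx_prediction_interval(rr, tau2, z=1.96):
    """Normal-approximation interval on the log scale using tau only (assumed method)."""
    centre = math.log(rr)
    half_width = z * math.sqrt(tau2)
    return math.exp(centre - half_width), math.exp(centre + half_width)

pi_low, pi_high = approx_prediction_interval(pooled_rr, tau2)
print(f"{pi_low:.2f} to {pi_high:.2f}")
```

Because the lower bound falls below 1, the interval is compatible with no benefit in some future settings, which is the sense in which effectiveness depends on implementation and local context.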

3.5.2. Mortality events

Two studies provided mortality data. In the randomized hemodialysis trial, mortality did not differ between AI-CDSS and conventional care (one death in each group), and other serious adverse events, such as hospital admission and blood transfusion, were also similar (7 versus 9 and 5 versus 5).36 The second study aimed to develop and validate a machine-learning-based risk algorithm rather than to assess a CDSS intervention in clinical practice. Renal and cardiovascular mortality were part of its predicted composite outcome, but the algorithm was not evaluated as an intervention.40 The effect of CDSS on mortality therefore could not be assessed in that trial, and it was excluded from quantitative synthesis of mortality (Table 5).

Table 5. Mortality and major clinical events.

Study | Outcome category | AI-CDSS results | Comparator results | Effect direction
Lim, 2025 (ref. 36) | Death events | No difference | Standard care | Neutral
Januzzi, 2025 (ref. 40) | Cardiovascular/renal death | Model predicts, not tested in practice | Not applicable | Not applicable

3.5.3. Renal function

The evidence suggests a possible benefit of CDSS on renal function and certain physiological outcomes, although findings varied across outcome types. Two studies reported benefits. One trial of a rule-based CDSS reported a less steep decline in the estimated glomerular filtration rate (eGFR) slope when facilitation was added (−0.01 with CDSS plus facilitation vs −0.95 with CDSS alone; p < 0.001; favors intervention),42 and one AI-CDSS study showed an increase in renal cortical blood flow (p < 0.05 between groups).39 Three other studies found neutral results or no difference for hemoglobin control (within target with AI-CDSS versus physician management),36 CKD quality metrics,33 and laboratory monitoring.43 One study concerned risk prediction model development rather than clinical implementation and therefore presented no comparative effect data,40 and another did not report data or direction of effect for this outcome.

3.5.4. Blood Pressure (BP) control and Cardiovascular (CV) outcomes

Evidence on the utility of CDSS for cardiovascular (CV) and blood pressure (BP) outcomes is accumulating, but the direction of effect was mixed across the included studies. Of five studies, three favoured the CDSS and two were neutral. Among the three studies reporting BP outcomes (change or control), one showed a significantly greater decrease in systolic blood pressure (SBP) in the CDSS arm than in the standard care group (−14.6 mmHg vs −11.7 mmHg; p = 0.005),41 whereas two others found no differences in BP control42 or BP change.44 In addition, one study reported improved quality of life (favoring AI-CDSS),39 and another reported a lower rate of severe intradialytic hypotension (also favoring AI-CDSS).38 Quantitative effect sizes and p-values were reported in only one study; the other four reported only the direction of effect without numerical values.

3.5.5. Usability and provider acceptance

Clinician engagement with CDSS was highly heterogeneous. Provider-level adherence in five studies with quantitative data ranged from 17% to 74%.29,31,34,45 Two studies described engagement only in qualitative terms (low use and high compliance),43,46 while four studies reported no adherence outcomes.32,33,36,37 User satisfaction was quantitatively evaluated in two studies, which showed low or no burden and high satisfaction rates of 74% and 82%, respectively.37,44 Barriers to implementation and use were reported in eight studies and included alert fatigue, lack of training, time pressure, unfamiliarity with CDSS or resistance to change, technical problems such as application crashes and disabled printing, economic reasons, COVID-19-related disruptions, and the overall time burden on users.29,31,33,36,37,43,45,46 Facilitators were cited in two studies, which reported integration of results and a reduction in false positives.32,34 Sustainability was reported in three papers: two reported plans to continue or increase use, and one found sustained error reduction.31,32,46 Overall, clinician acceptance of CDSS was moderate and varied considerably; better system integration may further support CDSS adoption.

3.5.6. Guideline adherence and clinical process measures

For guideline adherence and process measures, two studies favored AI-CDSS implementation, two were neutral, and one reported no effect direction. AI-CDSS was associated with increased ACEi/ARB use and nephrology referral, and with fewer medication errors [32,47]. In contrast, adherence to ESA guidance was either non-inferior to usual care [36] or remained low despite the CDSS [45], which probably reflects human and system-level barriers. Heterogeneity of outcomes prevented pooling; overall, benefits appear concentrated in prescribing, referral, and medication safety, whereas improving adherence may require more than decision support (Table 6).

Table 6. Guideline adherence and clinical process measures.

| Study | Outcome | AI-CDSS results | Comparator | Effect direction |
|---|---|---|---|---|
| Lim, 2025 [36] | ESA guideline adherence | Non-inferior | Standard care | Neutral |
| Sequist, 2018 [47] | ACEi/ARB, nephrology referral | Improved in high-risk | Standard care | Favors CDSS |
| Bhardwaja, 2011 [32] | Medication errors | Reduced | Standard care | Favors CDSS |
| Locatelli, 2009 [45] | Adherence | Low even with CDSS | Standard care | Neutral |

4. Discussion

This systematic review of 20 RCTs shows that renal CDSS mainly improve proximal medication safety, whereas their impact on downstream clinical outcomes remains unclear [29–48]. Most interventions were guideline-based rule systems embedded in electronic prescribing or EHRs, and a minority were AI-powered tools. Across more than 74,000 patients and 854 health professionals, CDSS shifted prescribing behaviour towards safer practice relative to control conditions but did not clearly reduce mortality or progression to kidney failure. Because implementation varied so widely between trials, we judged that a pooled estimate alone would conceal as much as it revealed. We therefore supplemented the quantitative analysis with a structured synthesis and granular, trial-by-trial mapping (Figures 4 and 5, Table 7, and Table S2). This framing moves the discussion beyond the dichotomous question of whether CDSS ‘works’. It shows how the evidence is distributed across delivery modes and workflow stages and, crucially, where downstream clinical evaluation is missing.

Table 7. Configuration-dependent signals and minimum reporting needs for CKD CDSS trials (Best evidence synthesis).

| Delivery mode | Workflow timing | Trials (n) | Included trials (first author, year) | Dominant function | Primary process target(s) (examples) | Process signal (trial-level) | Clinical endpoints evaluated (availability; summary) | Implementation measurement (availability; key types) | Recurrent missing elements (to enable future pooling) |
|---|---|---|---|---|---|---|---|---|---|
| Active interruptive alerts | Prescribing/order entry | 4 | Awdishu 2015 [29]; Samal 2024 [41]; Sperl-Hillen 2024 [33]; Terrell 2010 [31] | Dosing | Renal dosing/appropriate prescribing | Favors 3/4; Neutral 1/4 | 1/4; mixed | 4/4; adoption/use, adherence, alert burden, overrides | governance/update; severity tiering; ownership |
| Active interruptive alerts | Other/unclear | 1 | Bhardwaja 2011 [32] | Dosing | Medication errors; false positives | Favors 1/1 | 0/1; not evaluated | 1/1; adoption/use, adherence, burden/satisfaction | severity tiering; override reasons; governance detail |
| Passive dashboard/prompts | Prescribing/order entry | 1 | Abdel-Kader 2011 [48] | Recognition | Referral/testing prompts | Mixed 1/1 | 0/1; not evaluated | 0/1; NR | implementation descriptors variably NR (see Table S1) |
| Standalone/mobile decision aid | Prescribing/order entry | 1 | Erler 2012 [35] | Dosing | Renal dosing exceedance | Favors 1/1 | 0/1; not evaluated | 0/1; NR | alert modality; trigger logic; denominators |
| Standalone/mobile decision aid | Monitoring/follow-up | 1 | Locatelli 2009 [45] | Monitoring | Hb/ferritin thresholds | Neutral 1/1 | 0/1; not evaluated | 1/1; adoption/use, alert burden, satisfaction | alert modality; trigger logic; rule transparency |
| Standalone/mobile decision aid | Other/unclear | 1 | Alhodaib 2020 [37] | Monitoring | Decision accuracy score | Neutral 1/1 | 0/1; not evaluated | 1/1; adoption/use, burden/satisfaction | workflow efficiency NR; modality/triggering; integration detail |
| Unclear/NR | Prescribing/order entry | 3 | Field 2009 [30]; Lim 2025 [36]; Sequist 2018 [47] | Dosing/Monitoring | Appropriate orders/targets | Favors 1/3; Mixed 2/3 | 1/3; no difference | 3/3; adoption/use, burden, overrides | modality classification; severity tiering; interface/trigger detail |
| Unclear/NR | Monitoring/follow-up | 3 | Chen 2022 [39]; Mancini 2007 [38]; Peralta 2020 [44] | Monitoring/Recognition | Monitoring risks (e.g., hypotension/nutrition/CKD recognition) | Favors 3/3 | 3/3; mixed | 2/3; adoption/use, satisfaction | reporting gaps vary (see Table S1) |
| Unclear/NR | Other/unclear | 5 | Carroll 2018 [42]; Januzzi 2025 [40]; Patzer 2018 [46]; Peralta 2020 [34]; Samal 2022 [43] | Recognition/Risk prediction | CKD recognition/awareness; risk-tool labs | Favors 1/5; Neutral 1/5; Mixed/Worse/NR remaining | 3/5; mixed/NR/no difference | 3/5; adoption/use, burden/satisfaction | modality; trigger logic; required inputs |

4.1. Proximal medication-safety gains are consistent and clinically meaningful

The most consistent patient-safety benefit was more appropriate renal dosing and prescribing. Trials that included real-time alerts or order-entry checks reported more doses adjusted to kidney function, less use of contraindicated drugs, and less excessive dosing of renally eliminated drugs [29–32,35,38,39]. Because these errors are a common pathway to drug-induced acute kidney injury (AKI) and other serious events, such process gains are clinically relevant even where adjudicated adverse drug events (ADEs) were not fully captured. The substantial heterogeneity in the early dosing trials, however, suggests that impact depends heavily on local thresholds, targeted drug classes, degree of EHR integration, and provider response to prompts [29–32,35]. Consistent with this, our configuration-based Evidence Map (Figure 4) indicates that the densest evidence for process improvement clusters in interruptive, order-entry implementations, whereas clinical outcomes are reported far less consistently across configurations.
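To make the dependence on local thresholds and targeted drug classes concrete, the following is a minimal sketch of how an order-entry renal-dosing rule might be evaluated. It is not taken from any included trial: the drug names (`drug_a`, `drug_b`), the eGFR cut-offs, the dose limits, and the names `RenalDoseRule` and `check_order` are all hypothetical placeholders chosen for illustration, and they carry no clinical meaning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RenalDoseRule:
    drug: str
    egfr_below: float                    # rule applies when eGFR is below this value
    max_daily_dose_mg: Optional[float]   # None = flagged as contraindicated


# Placeholder rules: names and numbers are invented, not clinical guidance.
RULES = [
    RenalDoseRule("drug_a", egfr_below=30.0, max_daily_dose_mg=500.0),
    RenalDoseRule("drug_a", egfr_below=60.0, max_daily_dose_mg=1000.0),
    RenalDoseRule("drug_b", egfr_below=30.0, max_daily_dose_mg=None),
]


def check_order(drug: str, daily_dose_mg: float, egfr: float) -> Optional[str]:
    """Return an alert message if the order violates a rule, otherwise None."""
    # Keep only rules for this drug whose eGFR band contains the patient,
    # then apply the most restrictive band (lowest eGFR cut-off) first.
    matching = sorted(
        (r for r in RULES if r.drug == drug and egfr < r.egfr_below),
        key=lambda r: r.egfr_below,
    )
    if not matching:
        return None
    rule = matching[0]
    if rule.max_daily_dose_mg is None:
        return f"{drug}: contraindicated at eGFR < {rule.egfr_below:g}"
    if daily_dose_mg > rule.max_daily_dose_mg:
        return (
            f"{drug}: {daily_dose_mg:g} mg/day exceeds "
            f"{rule.max_daily_dose_mg:g} mg/day limit at eGFR < {rule.egfr_below:g}"
        )
    return None
```

In this sketch, changing a single cut-off in `RULES` changes which orders fire an alert, which is one plausible reason why trials with different local rule sets report heterogeneous effects.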

4.2. Why downstream clinical outcomes remain equivocal

By contrast, effects on renal function, cardiovascular events, and survival were inconsistent. A small number of trials combined a CDSS with other interventions, such as practice facilitation or intensive management, and reported favorable eGFR trajectories, improved renal cortical blood flow, or a lower rate of severe intradialytic hypotension, but no improvement in hemoglobin control, CKD quality measures, or survival [33,36,38–40,42–44,48]. This discrepancy between relatively clear benefits in proximal safety processes and neutral downstream outcomes is not surprising: trials were powered for process endpoints; follow-up periods were short; the CKD populations were heterogeneous, with comorbidities and competing risks; and other quality initiatives often ran concurrently. Under these conditions, even significant reductions in dosing error rates may translate into only small absolute changes in hard outcomes.

Figure 5 further illustrates the fragility of the evidence base. Visualizing the network at the trial level shows that clinical-outcome testing is restricted to substantially fewer configurations than the raw trial counts might imply. This poses a substantial interpretive risk: any synthesis of patient-level outcomes necessarily draws on a literature dominated by process-level data, and we therefore interpret such findings with extreme caution.

4.3. Documentation improvements are plausible enablers but not substitutes for safety outcomes

CDSS also modestly increased EHR-documented CKD recognition in trials that focused on problem-list documentation and risk stratification; an accurate CKD label is a prerequisite for activating renal-dosing rules, monitoring protocols, or nephrology referral prompts [33,34,42,44,48]. Most studies evaluated transparent rule systems that encode dosing thresholds or contraindications, whereas AI-enabled tools focused on higher-level predictive tasks, such as optimizing erythropoiesis-stimulating agent (ESA) dosing or estimating cardio-renal risk, rather than formulary-wide dose checking [29–32,35,36,38–40,42,48]. Randomized trial evidence indicates that AI systems are no less effective than standard practice on process-level and physiological surrogate outcomes. To date, however, we have not seen any fully published RCT showing a clear benefit of an AI system over a well-designed rule-based system for patient-centered safety outcomes. This is a crucial distinction. With this in mind, Table 7 presents configuration-specific “evidence signals” together with the persistent reporting gaps that still prevent us from ascertaining whether a system’s success is due to its algorithm or to better workflow integration.

4.4. Implementation and governance determine whether safety potential is realized

Implementation and usability results help explain why the safety potential of CDSS is only partly realized. Provider adherence to CDSS advice ranged from very low to moderate (17% to 74%). Two main barrier themes (alert fatigue and time pressure) and seven other recurring patterns emerged from the qualitative findings, including lack of training, weak technical integration, limited use of or coordination with pharmacists, and limited involvement in nephrologist consultations [29,31–33,36,37,43–46]. High rates of non-specific alerts can desensitize clinicians and add to alert fatigue, creating an automation-related hazard when critical warnings are missed; unclear responsibility for responding to high-risk alerts and weak embedding within multidisciplinary care can further dampen effects on clinical outcomes [32,36,45,47]. This review therefore characterises CDSS not simply as software but as a socio-technical intervention, in which safety depends as much on governance and workflow as on algorithmic accuracy. This shift in perspective is also why we did not treat delivery mode and workflow timing as fixed “background” variables but as active components of the intervention. To put these findings into context, the Extended data (Appendix 2: Trial-level mapping table) provide an in-depth examination of the included trials. By examining how this evidence was generated, we seek to distil what the current evidence base can actually support, an exercise we consider important given recent technological developments that could render such data obsolete.

4.5. Implications for practice: positioning renal CDSS within quality and patient safety programs

For health systems and quality-improvement leaders, none of this proves that renal-dose CDSS improves survival. Current evidence instead supports treating renal CDSS as safety-critical infrastructure whose primary value may lie in preventing avoidable prescribing errors, such as overdose of renally cleared agents, failure to reduce doses as eGFR falls, missed initiation of indicated ACEi/ARB therapy, or delayed nephrology referral [29–32,35,38,39,45,47]. Deployment should be embedded in wider medication-safety programs, including pharmacist-led review of high-risk prescriptions, standardized CKD order sets, and education on drug use in kidney disease. A few simple metrics (the proportion of renally inappropriate orders, override patterns for high-risk alerts, and serious medication-related events) can be used to recalibrate thresholds, retire low-value rules, and identify services that need support. Alert design, in turn, should focus on high-risk scenarios and give straightforward recommendations that fit into the ordering workflow [29,31–33,36,37,39,40,42–46,48]. Our configuration-based synthesis further suggests that implementation decisions (e.g., interruptive alerts at order entry vs non-interruptive dashboards) should be considered part of the intervention itself, because they condition adoption and the plausibility of achieving clinically meaningful downstream impact.
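The monitoring metrics named above can be computed from routine alert logs. The sketch below is one hedged way to do so; the `AlertEvent` record, its field names, and the severity labels are assumptions introduced here for illustration and do not describe the data model of any system in the included trials.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class AlertEvent:
    severity: str              # hypothetical tier label, e.g. "high" or "low"
    overridden: bool           # True if the clinician dismissed the alert
    override_reason: str = ""  # free-text or coded reason, may be empty


def inappropriate_order_rate(n_inappropriate: int, n_renal_orders: int) -> float:
    """Proportion of renally relevant orders judged inappropriate."""
    if n_renal_orders <= 0:
        raise ValueError("denominator must be positive")
    return n_inappropriate / n_renal_orders


def override_summary(
    events: Iterable[AlertEvent], severity: str = "high"
) -> Tuple[float, Counter]:
    """Override rate and reason counts for alerts in one severity tier."""
    tier = [e for e in events if e.severity == severity]
    overridden = [e for e in tier if e.overridden]
    # NaN signals that no alerts of this tier were logged at all.
    rate = len(overridden) / len(tier) if tier else float("nan")
    reasons = Counter(e.override_reason or "not stated" for e in overridden)
    return rate, reasons
```

Tracking the reason counts alongside the rate is a deliberate choice in this sketch: a high override rate dominated by a single reason points to a rule worth recalibrating, whereas a high rate with no stated reasons points to a reporting gap.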

4.6. Future research agenda

Three priorities arise from the existing evidence [29–32,35,36,38,40,42,48]. First, pragmatic, sufficiently powered, head-to-head RCTs comparing AI-CDSS with mature rule-based systems are required to establish whether advanced prediction and personalization provide incremental safety or cost-effectiveness gains beyond those achieved with well-implemented rules. Second, a CKD-CDSS core outcome set would improve comparability between studies; it should cover renal-dose accuracy, EHR-documented CKD recognition, nephrology referral, blood pressure control, and important patient safety outcomes such as acute kidney injury (AKI), serious ADRs, or MHRs. Third, new trials should follow frameworks such as CONSORT-AI, treat implementation parameters as primary outcomes, and use mediation analyses to link prescribing-process gains to clinical endpoints, with attention to transparency, bias (e.g., differences between trained and untrained sites), data drift, crossover, and equity in AI-CDSS models. To address the recurring evidence gaps (Table 7; Extended data, Appendix 2: Trial-level mapping table), future trials also need more complete reporting. This includes detailed trigger logic, the alert delivery mechanism, severity levels, and systematic reporting of override data, as well as ownership and governance arrangements for model updates. These factors determine whether a study’s data can inform interpretation or be included in subsequent meta-analyses.
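As an illustration of how the reporting items listed above could be checked systematically during data extraction, the following sketch defines a simple record and lists which items a trial report leaves empty. The class `CdssTrialReport`, its field names, and the example values are hypothetical and are not part of CONSORT-AI or any published checklist.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CdssTrialReport:
    # Each field holds the extracted description, or None if not reported.
    trigger_logic: Optional[str] = None
    delivery_mode: Optional[str] = None
    workflow_stage: Optional[str] = None
    severity_tiers: Optional[str] = None
    override_data: Optional[str] = None
    update_governance: Optional[str] = None


def missing_items(report: CdssTrialReport) -> List[str]:
    """Names of reporting items that are absent or empty, in field order."""
    return [f.name for f in fields(report) if not getattr(report, f.name)]


# Invented example: a trial that described its alerts but not their governance.
example = CdssTrialReport(
    trigger_logic="eGFR below a locally defined cut-off at order entry",
    delivery_mode="interruptive alert",
    workflow_stage="prescribing/order entry",
)
print(missing_items(example))
```

A record of this kind makes reporting presence or absence explicit per trial, which is the same information a reader needs to judge whether two trials are similar enough to pool.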

4.7. Strengths and limitations

This systematic review was reported according to the PRISMA guidelines, was prospectively registered in PROSPERO, and included only randomized controlled trials, which improves internal validity compared with mixed-design reviews. Outcome-specific RoB 2 assessments, together with random-effects models and fixed-effect sensitivity analyses, showed that process outcomes behaved differently from clinical endpoints and revealed an apparent temporal pattern: earlier dosing studies yielded larger but more heterogeneous effects, whereas more recent EHR-integrated trials produced smaller but more consistent gains in CKD awareness and associated processes. Another strength is the structured synthesis and the trial-level visualization of the evidence architecture (Figures 4 and 5, Table 7). Rather than simply producing pooled effects, these methods offer a more nuanced way to navigate a literature with such extensive heterogeneity. They guard against the interpretive myopia of reporting pooled estimates without regard to their specific context.
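For readers less familiar with the distinction between random-effects and fixed-effect pooling, the sketch below shows one common approach: inverse-variance fixed-effect pooling of log risk ratios alongside a DerSimonian-Laird random-effects estimate. This section does not state which estimator or software the review used, so the DerSimonian-Laird choice, the function name `pool_log_rr`, and the normal-approximation 95% interval are assumptions made for illustration only.

```python
import math
from typing import Dict, Sequence


def pool_log_rr(log_rr: Sequence[float], se: Sequence[float]) -> Dict[str, float]:
    """Fixed-effect and DerSimonian-Laird random-effects pooling of log risk ratios."""
    k = len(log_rr)
    if k < 2 or k != len(se):
        raise ValueError("need at least two trials with matching effects and SEs")

    # Fixed-effect (inverse-variance) estimate.
    w = [1.0 / s ** 2 for s in se]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum_w

    # Cochran's Q and the DerSimonian-Laird between-trial variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects estimate: weights include tau^2, so they are more equal.
    w_re = [1.0 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rr)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    return {
        "rr_fixed": math.exp(fixed),
        "rr_random": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se_re),
        "ci_high": math.exp(pooled + 1.96 * se_re),
        "tau2": tau2,
        "i2": i2,
    }
```

When the two estimates diverge or `tau2` is large, the random-effects interval widens, which is the situation in which a pooled effect should be read together with the trial-level mapping rather than on its own. A prediction interval, which additionally requires a t-distribution quantile, is omitted from this sketch.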

Important limitations must also be recognized. There was significant heterogeneity across many of the most important outcomes, driven by differences in outcome definitions and trial design, and only a proportion of eligible trials contributed to each meta-analysis. In some settings, there were too few studies with effect estimates to allow informative subgroup analyses or exploration of publication bias. Several studies were rated as having “some concerns” or “high” risk of bias in key domains, and the majority of studies took place in high-income countries with well-established EHR systems. Few trials explicitly quantified adjudicated adverse drug events and other patient-centred safety measures, so estimates of harm reduction still rest largely on the premise that improvements in prescribing practice translate into less harm [29–32,35,38]. The most consistent signal nevertheless concerns improved renal dosing and related prescribing practices. Finally, because trial reporting of configuration and implementation details was frequently incomplete, the Evidence Map and network visualization necessarily reflect what was reported; Table S2 is therefore provided to transparently document reporting presence/absence and to guide more reproducible future evaluations.

5. Conclusion

Rule-based and AI-driven CDSS have improved proximal medication safety in CKD by increasing the appropriateness of dosing and prescribing. However, a link between these systems and long-term clinical endpoints, such as renal function, cardiovascular outcomes, or overall mortality, has been difficult to demonstrate. One conclusion emerges clearly from this heterogeneous literature: the effectiveness of a CDSS is not an inherent property of the tool but depends on how the system interacts with clinical work processes. These findings support a shift in perspective, from CDSS as an add-on to CDSS as safety-critical infrastructure. Building systems that are both safe and trustworthy requires an enabling environment that includes regulation, human-centered design, and long-term monitoring. Research should therefore move towards more systematic head-to-head comparisons of AI and traditional rule-based systems. Contributing usable data to larger evidence syntheses requires adequate study power and full reporting transparency; these are fundamental hallmarks of good research and should not be sacrificed for the opportunity to take part in larger studies.

Declaration of generative AI and AI-assisted technologies in the manuscript preparation process

During the preparation of this paper, the authors used AI-assisted language tools (ChatGPT and Grammarly) to help improve English grammar and phrasing in the manuscript. All scientific content of the manuscript (study design, data extraction and analysis, interpretation, and conclusions) was developed by the authors. No AI tool produced any original scientific content. The authors thoroughly examined all AI-generated text and fully accept responsibility for the final manuscript content.

how to cite this article
Ismail A, Sauriasari R, Yanuar A et al. Improving Medication Safety in Chronic Kidney Disease Using Rule-Based and Artificial Intelligence–Based Clinical Decision Support Systems: A Systematic Review of Randomized Controlled Trials [version 1; peer review: awaiting peer review]. F1000Research 2026, 15:532 (https://doi.org/10.12688/f1000research.178438.1)