Keywords
Anesthesiology, Machine Learning, Postoperative Complications, Protocol, Surgery
This revised version was edited in response to the reviewer comments. Additional methods information has been added describing the training and validation of the machine learning predictions. In addition, we have added clarification that patients on preoperative dialysis, undergoing dialysis access procedures, or with baseline creatinine > 4.0 mg/dl will be excluded from acute kidney injury analyses. We have also added some sensitivity analyses and secondary outcomes suggested by the reviewer.
Each year, more than four million people worldwide die within 30 days after surgery, making perioperative events the third-leading cause of death.1 One common postoperative complication that is associated with an increased risk of death is acute kidney injury (AKI).2–6 In one retrospective study, postoperative AKI occurred in 39% of hospitalized surgical patients who had their creatinine checked after surgery.7 In-hospital mortality increased from 0.6% among those without AKI to 8.8% among those who experienced AKI.
Some postoperative deaths and AKI may be prevented through identification and modification of risk factors. Although some risk factors (e.g., age8–10) are not modifiable and other risk factors (e.g., preoperative glycemic control8,11,12) are no longer modifiable once surgery has begun, several important risk factors are under the anesthesiologist’s control during surgery. For example, intraoperative hypotension is a well-established risk factor both for postoperative death13–15 and for postoperative AKI.15–17 Interventions on such risk factors may therefore affect outcomes.18,19 Appropriate adjustments to the postoperative care plan may also impact outcomes.19
Telemedicine has successfully improved mortality and other outcomes in the intensive care unit (ICU), and similar results may be expected in the operating room. In a stepped-wedge trial across units in a single medical center, tele-ICU implementation was associated with an absolute mortality reduction of nearly 2%.20 In a subsequent multi-center pre-post study, tele-ICU was associated with reduced in-hospital mortality and length of stay.21 Improved outcomes were associated with enhanced compliance with best clinical practice guidelines.20,21 The effect of telemedicine on outcomes may depend on characteristics of the patients in question—in one study, tele-ICU implementation improved mortality only among patients with higher illness severity scores.22
The effectiveness of a telemedicine intervention depends on how well clinicians working in the telemedicine setting can assess patient risk and identify potential interventions to reduce that risk. Accurate intraoperative risk assessment by clinicians is challenging for several reasons. First, the sheer volume of available data exceeds the brain’s limited processing capacity.23,24 Competing clinical demands further reduce the cognitive capacity available to process these data.25,26 Second, anesthesiology clinicians frequently use the available data to make decisions that are not based on sound logic.27 Clinicians may anchor on the first diagnosis that comes to mind, seek out only data that confirm a previously suspected diagnosis, or interpret ambiguous findings with a false sense of certainty.27 These concerns become increasingly significant as clinicians monitor more patients and have less time to review each patient’s information.
Clinicians in telemedicine settings may use machine learning (ML) tools to compensate for these cognitive limitations. Computers can process larger quantities of data more quickly than a human, can model more complex relationships and interactions among inputs, and do not fatigue.28–30 The ability of clinicians to integrate the output of ML tools into their overall assessment of a patient depends on the ML output being presented in a manner that makes sense to the clinician and caters to the clinician’s information needs.31,32 These needs vary with the clinician’s background and the context in which the clinician is working.
This study is possible because the investigators have previously developed ML algorithms for predicting postoperative death and AKI. They used a retrospective cohort of approximately 110,000 patients who underwent surgical procedures at Barnes-Jewish Hospital between 2012 and 2016.33 This dataset was divided into a training set (for model parameter tuning), a validation set (for hyperparameter tuning), and a test set (for quantifying model performance) in a ratio of approximately 7:1:2. Inputs to the models included patient characteristics, health conditions, preoperative vital signs and laboratory values, and intraoperative vital signs and medications. The resulting model predicted postoperative death with excellent accuracy (area under the receiver operating characteristic curve of 0.91, 95% confidence interval 0.90-0.92).34,35 A separate model predicted AKI with good accuracy (area under the receiver operating characteristic curve of 0.82, 95% confidence interval 0.81-0.84).36
In addition, the investigators have recently conducted a series of focus groups and user interviews with anesthesiology clinicians (unpublished data) to learn about their workflows and information needs when working in the intraoperative telemedicine unit at Barnes-Jewish Hospital. Based on these insights, the investigators have designed a display interface that shows ML predictions for postoperative death and AKI in real-time during surgery.
The objective of this study is to determine whether anesthesiology clinicians in a telemedicine setting can predict postoperative death and AKI more accurately with ML assistance than without ML assistance. The hypothesis is that clinician predictions will be more accurate with ML assistance than without ML assistance.
The Perioperative Outcome Risk Assessment with Computer Learning Enhancement (Periop ORACLE) study will be a sub-study nested within the ongoing TECTONICS trial (NCT03923699). TECTONICS is a single-center randomized clinical trial assessing the impact of an anesthesiology control tower (ACT) on postoperative 30-day mortality, delirium, respiratory failure, and acute kidney injury.37 As part of the TECTONICS trial, clinicians in the ACT perform medical record case reviews during the early part of surgery and document how likely they feel each patient is to experience postoperative death and AKI. In Periop ORACLE, these case reviews will be randomized to be performed with or without access to ML predictions.
The study will be conducted at Barnes-Jewish Hospital, a 1,252-bed university-affiliated tertiary care facility in St. Louis, MO. About 19,000 inpatient surgeries are performed in the hospital’s 58 operating rooms each year.
The participants will include all patients enrolled in the TECTONICS trial during the 12-month sub-study period for whom the ACT clinicians conduct a case review. The inclusion criteria for TECTONICS include (1) surgery in the main operating suite at Barnes-Jewish Hospital, (2) surgery during hours of ACT operation (weekdays 7:00am-4:00pm), and (3) age ≥ 18. Exclusion criteria include procedures performed without anesthesiology services.
Participants are currently enrolled in TECTONICS via a waiver of consent. The investigators have obtained a waiver of informed consent to include these patients in Periop ORACLE as well.
The machine learning models used in this study were originally trained and validated on a retrospective cohort of approximately 110,000 adult patients who underwent surgery with general anesthesia at Barnes-Jewish Hospital between 2012 and 2016. Input features included demographic characteristics, comorbid conditions, preoperative vital signs, surgical service, functional capacity as documented during the preoperative assessment, and the most recent values of selected laboratory tests. A random forest model was implemented in scikit-learn. In the holdout validation cohort of 21,171 patients, the incidence of postoperative death was 2.2%, and the model predicted this outcome with an area under the receiver operating characteristic curve (ROC AUC) of 0.939 and an area under the precision-recall curve (PR AUC) of 0.161. The incidence of postoperative AKI was 6.1%, and the model predicted this outcome with a ROC AUC of 0.799 and a PR AUC of 0.275.
In February 2022, the models were retrained using a newer cohort of 84,455 patients who underwent surgery with general anesthesia at Barnes-Jewish Hospital between 2018 and 2020. This time period was after the hospital had transitioned from its previous electronic health record systems (including MetaVision as its anesthesia information management system) to Epic (Epic, Verona, WI). Input features were the same as in the previous models, with the addition of the planned surgical procedure text field. A regularized logistic regression was used to predict the outcome from the words in the planned surgical procedure text field, and its output was used to initialize a gradient boosted decision tree model trained on the remaining features. Hyperparameters were selected by 10-fold cross validation. Models were implemented in XGBoost. In the holdout validation cohort of 16,891 patients, the incidence of postoperative death was 1.9%, and the model predicted this outcome with a ROC AUC of 0.91 and a PR AUC of 0.27. The incidence of postoperative AKI was 13.3%, and the model predicted this outcome with a ROC AUC of 0.90 and a PR AUC of 0.66.
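For illustration, the two-stage approach described above can be sketched along the following lines. This is not the study’s production code: the inputs, regularization strength, and boosting hyperparameters are placeholders, and XGBoost’s base_margin mechanism is one plausible way to initialize the boosted trees from the text model.

```python
# Illustrative sketch only: text-based logistic regression feeding a gradient
# boosted model. Feature inputs and hyperparameters are placeholders.
import numpy as np
import xgboost as xgb
from scipy.special import logit
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def fit_two_stage(procedure_text, X_structured, y):
    """Stage 1: regularized logistic regression on the planned-procedure text.
    Stage 2: gradient boosted trees on the remaining features, initialized
    from the stage-1 log-odds via XGBoost's base_margin."""
    vectorizer = TfidfVectorizer(min_df=5)
    text_features = vectorizer.fit_transform(procedure_text)
    text_model = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
    text_model.fit(text_features, y)

    # Stage-1 probabilities converted to log-odds and used as the starting
    # point for the boosted trees (a real pipeline would generate these
    # out-of-fold to avoid leakage).
    p_text = np.clip(text_model.predict_proba(text_features)[:, 1], 1e-6, 1 - 1e-6)
    dtrain = xgb.DMatrix(X_structured, label=y, base_margin=logit(p_text))

    params = {"objective": "binary:logistic", "eval_metric": "aucpr",
              "max_depth": 4, "eta": 0.05}
    booster = xgb.train(params, dtrain, num_boost_round=500)
    return vectorizer, text_model, booster
```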
A password-protected web application on a secure server has been created for delivery of the machine learning algorithm outputs to clinicians. The design of this web application was informed by input obtained from its intended users (clinicians from the ACT) during focus group sessions. For each patient, the machine learning predictions are presented as predicted probabilities of each outcome (e.g., a 4.5% chance of AKI). In addition, a list of the features contributing most to the prediction is shown, along with their Shapley values. During the focus group meetings, most users said they would find it overwhelming to see confidence intervals around the predicted probabilities. A fact sheet about each model is available on demand, including the receiver operating characteristic curve, precision-recall curve, and calibration curve.
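As one way the per-patient feature contributions could be generated for such a display, the sketch below uses the shap package’s tree explainer; this is an assumption about tooling for illustration, not a description of the production interface.

```python
# Illustrative sketch only: computing per-patient Shapley values for a
# tree-based model with the shap package (assumed tooling, not the study's).
import shap

def top_contributors(booster, x_patient, feature_names, k=5):
    """Return the k features contributing most (by absolute Shapley value)
    to a single patient's prediction; x_patient has shape (1, n_features)."""
    explainer = shap.TreeExplainer(booster)
    shap_values = explainer.shap_values(x_patient)
    ranked = sorted(zip(feature_names, shap_values[0]),
                    key=lambda pair: abs(pair[1]), reverse=True)
    return ranked[:k]
```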
Clinicians in the ACT currently conduct case reviews by viewing the patient’s records in AlertWatch (AlertWatch, Ann Arbor, MI) and Epic. AlertWatch is an FDA-approved patient monitoring system designed for use in the operating room. Clinicians in the ACT use a customized version of AlertWatch that has been adapted for use in a telemedicine setting.38 Epic is the electronic health record system utilized at Barnes-Jewish Hospital. For patients included in Periop ORACLE, each case review will be randomized in a 1:1 fashion to be completed with or without ML assistance. If the case review is randomized to ML assistance, the clinician will access the machine learning web application (see previous section) during the case review. Study staff will be immediately available in the ACT during all case reviews to assist with accessing the display interface if needed, in order to improve adherence to the protocol. If the case review is not randomized to ML assistance, the clinician will not access this web application. This choice of comparator mimics current practice more closely than an active comparator would. After viewing the patient’s data, the clinician will complete a case review form in AlertWatch (as described in the data collection section later in this document).
The co-primary outcomes will be clinician accuracy in predicting postoperative death and clinician accuracy in predicting postoperative AKI. Clinician predictions will be retrieved from the case review forms completed in AlertWatch. Observed death and observed AKI will be retrieved from Epic. Death will be defined as 30-day in-hospital mortality. AKI will be defined as a creatinine increase ≥0.3 mg/dl above baseline within 48 hours or an increase to ≥1.5 times baseline within seven days, consistent with the Kidney Disease: Improving Global Outcomes (KDIGO) definition.39 If no preoperative creatinine is available, the upper limit of the laboratory’s reference range (1.2 mg/dl) will be used as the baseline.
Secondary outcomes will include AKI stage 2 or greater (creatinine increase to ≥2 times baseline within seven days) and AKI stage 3 (creatinine increase to ≥3 times baseline within seven days, an increase to ≥4.0 mg/dl, or initiation of renal replacement therapy).
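For concreteness, the creatinine-based outcome definitions above could be operationalized as in the following sketch. The data structures are hypothetical, timing is anchored to the end of surgery as an assumption, and the 1.2 mg/dl fallback baseline mirrors the protocol; this is not the study’s extraction code.

```python
# Hedged sketch of the KDIGO creatinine-based AKI staging described above.
from datetime import timedelta

DEFAULT_BASELINE = 1.2  # mg/dl, the lab's upper reference limit, used when no
                        # preoperative creatinine is available

def aki_stage(baseline, postop_creatinine, surgery_end, on_rrt=False):
    """baseline: preoperative creatinine in mg/dl (or None).
    postop_creatinine: list of (timestamp, value in mg/dl) tuples.
    Returns 0 (no AKI) or the highest KDIGO stage (1-3) reached."""
    baseline = baseline if baseline is not None else DEFAULT_BASELINE
    within_48h = [v for t, v in postop_creatinine
                  if t <= surgery_end + timedelta(hours=48)]
    within_7d = [v for t, v in postop_creatinine
                 if t <= surgery_end + timedelta(days=7)]

    stage = 0
    # Stage 1: rise >=0.3 mg/dl within 48 h, or >=1.5x baseline within 7 d
    if (within_48h and max(within_48h) - baseline >= 0.3) or \
       (within_7d and max(within_7d) >= 1.5 * baseline):
        stage = 1
    # Stage 2: >=2x baseline within 7 d
    if within_7d and max(within_7d) >= 2.0 * baseline:
        stage = 2
    # Stage 3: >=3x baseline, increase to >=4.0 mg/dl, or renal replacement therapy
    if on_rrt or (within_7d and
                  (max(within_7d) >= 3.0 * baseline or max(within_7d) >= 4.0)):
        stage = 3
    return stage
```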
No direct interactions between study staff and the participants are planned, and no anesthetic interventions will be prohibited based on participation in this trial. The initial medical record review and clinician predictions will occur during the participants’ surgery. Additional data retrieval to obtain observed death and AKI will occur at least 30 days after surgery (and may occur in bulk 30 days after the final participant’s surgery).
The sample size calculation is based on the assumption that ML-assisted clinicians would predict each outcome with an area under the receiver operating characteristic curve (AUC) similar to the published AUC of the ML algorithms.34,36 Incidences of death and AKI were also taken from these previous publications. A simulation population of 100,000 patients was generated. ML-assisted and unassisted clinician predictions were simulated with beta distributions whose parameters were adjusted to achieve the specified AUCs. For each sample size tested, 1,000 random samples were drawn, and the difference in AUC between the assisted and unassisted clinician predictions was determined.40 Power at each sample size was defined as the fraction of samples in which the two sets of predictions had significantly different AUCs at α = 0.025. The minimum clinically meaningful difference (MCMD) in AUC was defined as 0.07.
As shown in Figure 1, a sample size of 4,500 will provide 80% power to detect a difference in AUC from 0.91 to 0.84 (the MCMD) for death and 95% power to detect a difference from 0.91 to 0.81. This sample size will give >99% power to detect a difference in AUC from 0.82 to 0.75 (the MCMD) for AKI (Figure 2).
Figure 1. Power achieved in simulations of various sample sizes and effect sizes for postoperative death. The blue line indicates the minimum clinically meaningful difference.
Figure 2. Power achieved in simulations of various sample sizes and effect sizes for postoperative AKI. The blue line indicates the minimum clinically meaningful difference.
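The simulation approach described above is illustrated in the following sketch. The beta-distribution parameters are placeholders that would be tuned to reach the target AUCs, and the AUC comparison test itself (reference 40) is passed in as an abstract callable rather than implemented here; this is not the study’s actual simulation code.

```python
# Illustrative sketch of the power simulation; beta parameters are placeholders
# and the AUC comparison test is supplied by the caller, not implemented here.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2022)

def simulate_predictions(y, beta_pos, beta_neg):
    """Simulated clinician risk estimates: cases drawn from one beta
    distribution, non-cases from another; the parameters set the AUC."""
    return np.where(y == 1,
                    rng.beta(*beta_pos, size=y.size),
                    rng.beta(*beta_neg, size=y.size))

y = rng.binomial(1, 0.022, size=100_000)                      # ~2.2% simulated mortality
assisted = simulate_predictions(y, (4.0, 6.0), (2.0, 10.0))   # placeholder parameters (higher AUC)
unassisted = simulate_predictions(y, (3.0, 6.0), (2.0, 8.0))  # placeholder parameters (lower AUC)
print(roc_auc_score(y, assisted), roc_auc_score(y, unassisted))  # check achieved AUCs

def power(n_sample, auc_test, n_reps=1000, alpha=0.025):
    """Fraction of random samples in which assisted and unassisted AUCs differ
    significantly. auc_test is a callable returning a p-value for the AUC
    difference (e.g., an implementation of the test in reference 40)."""
    hits = 0
    for _ in range(n_reps):
        idx = rng.choice(y.size, size=n_sample, replace=False)
        hits += auc_test(y[idx], assisted[idx], unassisted[idx]) < alpha
    return hits / n_reps
```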
To allow for 15% missing data, we will enroll 5,300 cases. Currently, about 20 case reviews are performed in the ACT on a typical day. We should therefore be able to complete enrollment over a period of 12 months.
Each participant case review will be randomized in a 1:1 fashion to be completed with or without ML assistance. Randomization will be stratified by intervention/control status in the parent trial because clinicians may be biased to perform case reviews more carefully in the TECTONICS intervention group than in the TECTONICS control group (Figure 3). The allocation sequence will be generated using computer-generated random numbers within the AlertWatch software. The allocation will be displayed automatically on the case review form when the clinician opens it. Study staff will not have access to the allocation sequence before the case review.
Clinician predictions will be collected via an existing case review form in AlertWatch that contains two sections (Figure 4). In the first section, the clinician documents how likely the patient is to experience postoperative death within 30 days, postoperative AKI within seven days, and a few other complications. The clinician selects their choice from a five-point ordered categorical scale (very low risk, low risk, average risk, high risk, and very high risk). In the second section (which is part of the parent TECTONICS trial but not used for this sub-study), the clinician selects treatment recommendations (e.g., blood pressure goals, medications to administer or avoid). An additional question asks the clinician to confirm whether they reviewed the ML display interface prior to documenting their predictions in section one. Finally, clinicians are asked whether the ML predictions on the display interface agree with their previous opinion and whether the display interface caused the clinician to change their opinion.
The first two sections contain fields for documentation of clinician predictions of postoperative complications, and the Periop ORACLE randomization allocation is disclosed in the header of the second section. The treatment recommendations documented in the third section are utilized for the parent TECTONICS trial but not for this sub-study. The amount of time the clinician in the ACT spends completing each case review will be retrieved from the Epic audit log.
Observed death and observed AKI will be retrieved from Epic via a query of the Clarity database. Observed outcomes will be retrieved for all randomized participants, including those with protocol deviations.
The clinical applications used in this project (AlertWatch and Epic) can only be accessed over the secure institutional internet network or using virtual private network, and user authentication is required for access. Computations for the ML models will occur on institutional servers that meet the standards of the Health Insurance Portability and Accountability Act. The ML display interface is also hosted on an institutional server, can only be accessed over the secure institutional internet network or using virtual private network, and requires user authentication for access. For data analysis, the amount of protected health information retrieved will be limited to the minimum amount necessary to achieve the aims of the study. All study data will be stored on institutional servers, and access will be limited to study personnel.
To compare the accuracy of clinician predictions with and without ML assistance, the investigators will construct logistic regressions for death and for AKI. Separate models will be constructed for the ML-assisted and -unassisted groups. The only inputs will be dummy variables encoding the clinician predictions (Table 1). The regression coefficients will be restricted to positive values, thus modeling a monotonic increasing relationship between the clinician predictions and the true incidence of death and AKI. The AUC of the model constructed from ML-assisted cases will be compared to the AUC of the model constructed from ML-unassisted cases,40 using the Holm method to ensure the family-wise error rate remains less than a two-sided α = 0.05 across the two co-primary outcomes—accuracy of death prediction and accuracy of AKI prediction. The null hypothesis is that the AUCs will be equal.
Table 1. Dummy variable encoding of the five-point clinician prediction scale.

| Clinician prediction | Var1 | Var2 | Var3 | Var4 |
|---|---|---|---|---|
| Very low | 0 | 0 | 0 | 0 |
| Low | 1 | 0 | 0 | 0 |
| Average | 1 | 1 | 0 | 0 |
| High | 1 | 1 | 1 | 0 |
| Very high | 1 | 1 | 1 | 1 |
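The constrained regression described above could be sketched as follows. This is an illustration only: it fits a logistic regression on the Table 1 dummy coding with the coefficients bounded below by zero using a generic bounded optimizer, which is one way (not necessarily the investigators’ way) to impose the monotonicity constraint.

```python
# Hedged sketch of the primary analysis model: logistic regression on the
# Table 1 dummy coding with non-negative coefficients, fit by bounded maximum
# likelihood. Illustrative only; not the study's analysis code.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from sklearn.metrics import roc_auc_score

CODES = {"very low": [0, 0, 0, 0], "low": [1, 0, 0, 0], "average": [1, 1, 0, 0],
         "high": [1, 1, 1, 0], "very high": [1, 1, 1, 1]}

def encode(ratings):
    """Map the five-point clinician ratings to the Table 1 dummy variables."""
    return np.array([CODES[r] for r in ratings], dtype=float)

def fit_monotonic_logistic(X, y):
    """Maximize the Bernoulli log-likelihood with the dummy coefficients
    bounded at zero, so predicted risk is non-decreasing across categories."""
    def neg_log_lik(params):
        eta = params[0] + X @ params[1:]
        p = np.clip(expit(eta), 1e-12, 1 - 1e-12)
        return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    bounds = [(None, None)] + [(0, None)] * X.shape[1]
    fit = minimize(neg_log_lik, x0=np.zeros(X.shape[1] + 1),
                   bounds=bounds, method="L-BFGS-B")
    return fit.x  # intercept followed by the four constrained coefficients

def model_auc(params, X, y):
    """AUC of the fitted model, computed separately within the ML-assisted
    and unassisted groups before the two AUCs are compared."""
    return roc_auc_score(y, expit(params[0] + X @ params[1:]))
```

In the analysis itself, the two group-specific AUCs would then be compared using the test cited above, with the Holm adjustment applied across the two co-primary outcomes.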
The primary analysis will follow an intention-to-treat principle, and all case reviews will be included in the group to which they were randomized. Patients on dialysis preoperatively, undergoing dialysis access procedures, or with baseline creatinine > 4.0 mg/dl will be excluded from the AKI analysis but included in the death analysis. In a secondary per-protocol analysis, case reviews will be grouped according to whether the clinician reported viewing the ML display or not. Exploratory analyses stratified by sex and race will evaluate for biases learned during training.41 If either clinician predictions or observed outcomes are missing for a given case, then the case will be excluded from the analysis. No interim analyses are planned. To determine whether different levels of clinician engagement in the TECTONICS intervention group versus the TECTONICS control group impact the findings, sensitivity analyses will be conducted stratified by TECTONICS intervention status. To examine the effects of dataset shift and model retraining, the prospective performance of the models will be reported for each month of the study, and sensitivity analyses will be conducted in the subgroups who had surgery before and after the February 2022 retraining event.
To examine human-computer agreement, the proportion of cases for which the clinician reported being surprised by the ML prediction will be determined. Among those cases for which the clinician was surprised, the proportion for which the clinician self-reported agreeing or disagreeing with the ML prediction will be determined.
To examine the potential impact of inaccurate ML predictions, we will conduct the following sensitivity analysis. First, the predicted probabilities output by the ML algorithms will be converted to dichotomous predictions by using the cutoff value that maximizes the Youden index. Second, the cases will be categorized as having correct or incorrect ML dichotomized predictions. Finally, the primary analysis will be repeated in the subgroup with correct ML predictions and in the subgroup with incorrect ML predictions.
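A minimal sketch of this dichotomization step is shown below; the function names are illustrative, but the cutoff rule follows the Youden-index approach described above.

```python
# Minimal sketch of dichotomizing the ML predicted probabilities at the cutoff
# that maximizes the Youden index (sensitivity + specificity - 1).
import numpy as np
from sklearn.metrics import roc_curve

def youden_cutoff(y_true, predicted_prob):
    """Probability cutoff maximizing sensitivity + specificity - 1."""
    fpr, tpr, thresholds = roc_curve(y_true, predicted_prob)
    return thresholds[np.argmax(tpr - fpr)]

def ml_prediction_correct(y_true, predicted_prob):
    """Flag each case as having a correct or incorrect dichotomized ML
    prediction; the resulting mask splits the sensitivity-analysis subgroups."""
    cutoff = youden_cutoff(y_true, predicted_prob)
    dichotomous = (predicted_prob >= cutoff).astype(int)
    return dichotomous == y_true
```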
To estimate the effect of the ML predictions on case review efficiency, chart review duration (approximated by the time the Epic chart was open, retrieved from the audit log) will be compared between case reviews performed with ML assistance and those performed without ML assistance, using either an unpaired t test or a Wilcoxon rank-sum test, as appropriate.
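A hedged sketch of this comparison is given below; the Shapiro-Wilk normality check used to choose between the two tests is an assumption for illustration, since the protocol does not specify how "as appropriate" will be judged.

```python
# Hedged sketch of the duration comparison; the normality check is an assumed
# decision rule, not specified in the protocol.
from scipy import stats

def compare_durations(durations_assisted, durations_unassisted, alpha=0.05):
    """Unpaired t test if both groups look approximately normal,
    otherwise the Wilcoxon rank-sum test."""
    normal = (stats.shapiro(durations_assisted).pvalue > alpha and
              stats.shapiro(durations_unassisted).pvalue > alpha)
    if normal:
        return stats.ttest_ind(durations_assisted, durations_unassisted)
    return stats.ranksums(durations_assisted, durations_unassisted)
```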
This study was approved by the Human Research Protection Office at Washington University in St. Louis (approval #202108022) on August 26, 2021. Any protocol amendments will be approved by the study steering committee and communicated to the institutional review board. This study poses minimal risks to patients, other than a small risk of a breach of confidentiality if protected health information were to become unintentionally available to individuals outside the study team. To protect against this risk, all electronic data will be kept in an encrypted, password-protected environment accessible only to the research team (see the Data Management section). Because the risks are minimal, no dedicated safety monitoring committee is planned for Periop ORACLE; however, the parent TECTONICS trial does have a safety monitoring committee. The institutional review board has granted a waiver of informed consent to enroll patients in this study.
Study results will be presented at national or international scientific meetings and published in a peer-reviewed publication. Individuals who meet the International Committee of Medical Journal Editors authorship guidelines (https://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html) will be included as authors on any publications resulting from this work. To comply with data sharing recommendations, de-identified individual participant data underlying the study results will be made available to researchers who provide a methodologically sound proposal for utilizing that data.
This project has multiple strengths. First, this protocol has been prepared in accordance with Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) guidelines.42,43 Second, the sample size of predictions made by experienced anesthesia clinicians will be large. The pragmatic design of superimposing Periop ORACLE on the established TECTONICS trial where many case reviews are already being conducted on a daily basis makes it feasible to achieve such a large sample size. Third, the ML models to be used have very good AUC, which enhances the likelihood that ML assistance can increase the accuracy of clinician predictions. Fourth, the ML display interface has been created using a human-centered design framework informed by the clinicians who work in the ACT, which maximizes the chances that clinicians will integrate the ML display interface into their workflows and their decision-making.
This project also has limitations. First, data collection in a telemedicine setting rather than in the operating room may limit the generalizability of the findings to institutions that do not utilize intraoperative telemedicine. However, multiple clinicians have stated during focus group interviews that their workflows for performing case reviews in the ACT closely mimic their workflows for preparing to provide bedside anesthesiology care. Thus, the findings may be relevant in the operating room as well. Second, this is a single-center study, so differences in patient population or practice patterns at other institutions may limit generalizability. However, many large academic medical centers care for patient populations similar to those seen at Barnes-Jewish Hospital. Third, some of the outcomes may be incompletely measured. While in-hospital vital status should always be available, some patients may not have a postoperative creatinine measurement available to distinguish whether AKI has occurred. However, AKI is expected to be extremely uncommon among patients without postoperative labs, minimizing the effect of this potential bias. Fourth, AKI is defined using creatinine only and not urine output, based on anticipated data availability. This may cause some cases of AKI to be missed. However, during ML model training, the incidence of AKI using this definition was compatible with the incidence of AKI reported in other studies. Finally, clinicians may give the ML display interface varying degrees of weight during case reviews randomized to ML assistance. Understanding this variability is of interest to the research team, which is why the case review form will ask the clinician how the ML display impacted their decision-making.
The expected result is that prediction accuracy for death and for AKI will be greater with ML assistance than without ML assistance. If the null hypothesis is rejected for only one of the two co-primary outcomes, the result will still be viewed as evidence that ML assistance is beneficial. A possible unintended consequence would arise if false-negative ML predictions led telemedicine clinicians to pay less attention to some patients who are actually at high risk for death or AKI.22 Even if ACT clinicians monitor these patients less intensely, the patients will still receive standard-of-care monitoring and care by clinicians in the operating rooms. Thus, a reduction in telemedicine monitoring should not result in patient harm; the telemedicine clinicians supplement, but do not replace, standard care.
Intraoperative telemedicine has the potential to improve postoperative outcomes if interventions can be targeted to patients most at-risk for complications, and ML may be able to help clinicians distinguish which patients those are. Periop ORACLE will test the hypothesis that anesthesiology clinicians in a telemedicine setting can predict postoperative complications more accurately with ML assistance than without ML assistance. By nesting this study within the ongoing TECTONICS randomized clinical trial of intraoperative telemedicine, the investigators will efficiently assemble a large dataset of clinician predictions that will be used to achieve the study objective. Successful completion of this study will help define the role of ML not only for intraoperative telemedicine, but for other risk assessment tasks before, during, and after surgery.
This protocol is presented according to the SPIRIT guidelines.
Open Science Framework: “Protocol for the Perioperative Outcome Risk Assessment with Computer Learning Enhancement (Periop ORACLE) Randomized Study”. https://doi.org/10.17605/OSF.IO/GC4ES.43
Data are available under the terms of the Creative Commons Zero “No rights reserved” data waiver (CC0 1.0 Public domain dedication).