iCOMPARE, what value does it add to resident duty-hour

Discussions regarding resident duty-hour restrictions have been ongoing and heated. One influential argument for restrictions has been patient safety. Two trials, FIRST and iCOMPARE, were performed to investigate this relationship in surgical and medicine training, respectively. As the authors approach this discussion from a medicine-based perspective, iCOMPARE will serve as the primary basis of our discussion. Results from the iCOMPARE trial, which compared flexible shifts (28-hour shifts allowed) with the original 2011 ACGME shift requirements (maximum 16 hours), were recently published in the New England Journal of Medicine. This non-inferiority trial used 30-day post-hospitalization mortality as its primary endpoint. Results met the criteria for non-inferiority, and ACGME policy was changed to allow 28-hour shifts for medicine residents. iCOMPARE results were highly lauded and used as the primary justification for extending resident duty hours. Despite this sweeping impact, few have critically evaluated what this study actually adds to the literature. Herein, we argue that the trial design raises serious questions. Most importantly, the non-inferiority margin chosen was large, and the primary endpoint is an ambiguous marker of resident performance. Additionally, we question the lack of both patient consent and direct patient-reported or patient-centered outcomes within the hospital stay. As discussion of patient-reported outcomes and shared decision making grows in the medical literature, we argue that iCOMPARE disregarded the patient perspective and meaningful patient outcomes in an attempt to maintain the status quo. Lastly, we discuss how iCOMPARE missed the broader question of actual duty-hour restrictions, and we describe practical methods already in use at some programs that may more directly balance resident work hours with patient care and resident learning.

Mr. C was friendly and rapport came quickly. He ultimately needed thoracentesis and during the ensuing discussion he perfunctorily asked, "How long have you been on your shift?" After replying, "12 hours," he responded, "I understand the necessity, but I want someone performing their best for this procedure. Can you wait until morning?" Fortunately, his case could be safely deferred until morning. This was neither the first, nor the last, time during my training a patient asked this question.
Recently, the New England Journal of Medicine published the final primary endpoint analysis for iCOMPARE with an accompanying editorial 1,2. This trial sought to evaluate the impact of flexible shifts (program-directed shift lengths of up to 28 hours) versus 16-hour capped shifts on 30-day post-hospitalization patient mortality. As the only trial investigating duty-hour effects on internal medicine patients, its impact is large. Indeed, in the ACGME's new duty-hour guidelines, results from iCOMPARE and FIRST (a similar study in surgical programs) are cited as a major justification for this change 3. However, as with any practice-changing study, critical appraisal of its results and application is required.
First, iCOMPARE lacked informed consent from either patients or residents. This is concerning with a primary endpoint of patient mortality 4. One participating institution's IRB application stated that the institution was not involved in human research because the residency program itself was the research subject; this claim seems implausible 4. Patients are both interested in, and concerned by, the length of their doctors' shifts 5. Indeed, in one study, a significant number of patients stated they would like to be informed of any clinician who had worked over 12 hours 6. Patients and residents deserve to be informed of factors affecting their care, particularly in a trial whose primary endpoint is 30-day patient mortality. For residents, this is especially concerning given their vulnerable workforce position. A 2004 lawsuit against the National Resident Matching Program asserted that residents were "forced to participate in a system that ensures they work long hours for low wages." 7 However, after a strong lobbying effort, national legislation exempted the residency match process from anti-trust laws. In this setting the potential for exploitation is high, and policies should err on the side of fair work conditions.
Second, the 1% non-inferiority margin utilized suggests that 2,646 additional deaths are an acceptable justification for longer shifts. Extrapolating this margin to the U.S. population, nearly 80,000 deaths annually would be considered "non-inferior." Understandably, limitations in recruiting and study design apply, especially for non-inferiority trials. However, we also question the appropriateness of a non-inferiority design for this question. A proposed ethical prerequisite for using non-inferiority analysis is that an experimental therapy "must have known advantages such as reduced cost, greater convenience, or fewer side effects to justify the randomization of patients to a therapy with unknown efficacy." 8-11 Long duty-hours have been linked to several resident, patient, and community safety concerns 12-15. Likewise, patients do not appear to feel comfortable with physicians who have worked long hours 5,6. Given these concerns, we argue that iCOMPARE should have been required to demonstrate superiority of unrestricted shift lengths. Indeed, one oft-used argument for extended shift hours is fewer handoffs and improved continuity. Inherent in the design appears to be a belief in superiority; otherwise, why put people through it? Furthermore, the hours worked in the two groups were the same; thus, it is wholly expected that no difference in the endpoint would be observed 16. Why not choose a restricted-hour work week, as is done in Europe, for the comparator? 17 Or, if extended shift lengths are believed to be better for patients, why not use this shift structure as the comparator? Lastly, we suspect that superiority in patient clinical endpoints would be the only result acceptable to patients for continuing a training system of which they do not approve.
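To make the arithmetic behind a non-inferiority margin concrete, the following is a minimal Python sketch. It is not the iCOMPARE analysis, which randomized residency programs rather than individual patients; it assumes a simple one-sided Wald bound on the difference in mortality risk, with a critical value of 1.645 (one-sided alpha of 0.05) chosen as an assumption. The function names and the mortality counts in the usage lines are hypothetical and serve only as illustration. The 264,600-patient denominator is a back-calculation from the 2,646 deaths and 1% margin cited above, not a figure quoted from the trial.

```python
import math


def deaths_tolerated(margin, n_patients):
    """Absolute number of extra deaths permitted by a risk-difference margin."""
    return margin * n_patients


def implied_population(extra_deaths, margin):
    """Population size at which a given margin corresponds to extra_deaths."""
    return extra_deaths / margin


def is_noninferior(deaths_new, n_new, deaths_ref, n_ref, margin, z=1.645):
    """One-sided Wald check on the risk difference (new minus reference).

    Returns (verdict, observed difference, upper confidence bound).
    The new arm is declared non-inferior when the upper bound is below
    the margin. Assumption: independent patients, no clustering.
    """
    p_new = deaths_new / n_new
    p_ref = deaths_ref / n_ref
    diff = p_new - p_ref
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    upper = diff + z * se
    return upper < margin, diff, upper


# A 1% margin and 2,646 deaths imply roughly 264,600 patients.
print(round(implied_population(2646, 0.01)))

# Hypothetical counts: mortality of 12.5% versus 12.0% in arms of 100,000.
verdict, diff, upper = is_noninferior(12500, 100000, 12000, 100000, 0.01)
print(verdict, round(diff, 4), round(upper, 4))
```

With these hypothetical counts, an arm with an observed mortality half a percentage point higher than the reference (500 more deaths per 100,000 patients) is still labeled non-inferior, because the upper bound stays below the 1% margin. This is the sense in which we describe the margin as large.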
Finally, the primary endpoint of iCOMPARE is inadequately justified. Residency training should have sufficiently robust supervision that shift length has minimal effect on 30-day mortality. Orders made by an intern pass through supervising residents, attendings, pharmacists, and nurses. While errors can and do get through, post-hospitalization mortality is buffered from the source of the errors in question. This is supported by one study showing an increased length of hospital stay and an increased number of ICU admissions with long resident work hours, but a non-significant change in hospital readmissions and in-hospital mortality 18. Likewise, we suspect 30-day mortality is not the only issue about which patients are concerned. Quantifying errors and near-misses made within the hospital, as well as the patient experience, could more directly evaluate causal links with resident performance. Additionally, while markers of resident performance (testing performance, wakefulness, etc.) were evaluated, we argue that, with equal work hours in the two groups and minimal implementation of 28-hour shifts, no meaningful comparison between the groups can be derived 19,20. With no difference in hours worked and a large non-inferiority margin on an ill-defined primary endpoint, we argue that little can be gained from these data.
While the idea of a flexible shift structure is sensible as policy, it severely diluted the applicability and generalizability of the results. Currently, policy extends further than the data. A program could, in theory, make a majority of its rotations use a 28-hour shift structure, and this would be within policy. However, the safety of this structure is untested, as it was not directly evaluated in iCOMPARE. Further, even if the results are taken at face value, their reciprocity is seemingly ignored. This trial, as a non-inferiority trial, equally demonstrated that capped shifts, with the changes in care they bring, do not worsen patient mortality. However, what if even these are too long? Perhaps both capped and flexible structures, as they currently stand with 80-hour work week restrictions, are inferior to a more restricted system? We posit that this is a more pressing and sensible question.
We believe that the results of iCOMPARE are critically lacking; we can and must do better (Table 1). We do not argue in favor of the 2011 ACGME requirements or the new adjustments. Rather, we suggest that iCOMPARE itself was constructed to test an artificial argument, and did so poorly. Thoughtfully tested data are necessary to further duty-hour discussions and guide further policy refinement, as is an openness to consider new paradigms. Let us work to cultivate a group of compassionate, empathetic, and well-trained physicians, armed not only with technical prowess but also with the emotional reserve to respond to and alleviate patient suffering. In contrast to this goal, the iCOMPARE study is sadly a self-fulfilling prophecy: a superficial justification to continue the status quo.

Table 1. Summary of concerns and proposed changes.

Lack of informed consent
-Residency programs were considered the study "subjects." Patients and residents were not required to give consent. This reasoning is flawed 4.
Possible changes:
-In trials investigating patient safety, patients should be informed 4.
-Residents likewise should have been informed and consented as participants in the trial. Medical students should be made aware of programs' participation pre-Match, and current residents should be consented and involved in decisions to participate.

Culture
-"I survived residency living at the hospital, so should everyone else."
-Belief that the old way of training is the best way of training.
-Medical practice has changed, with some change to trainees as well. If this is ignored, relationships between program leadership and residents will suffer.
Possible changes:
-Cultivate a culture supportive of resident concerns and needs.
-Collaborate with residents for flexible solutions.
-Improved work culture may lead to a better culture of care for patients (more compassionate, empathetic, and emotionally engaged).

Exploitation
-Residents are a cheap labor force.
-Restricting duty-hours would likely require hospitals to expand coverage.
-Bias of avoiding expanded coverage is apparent in iCOMPARE conclusions.
-The data support similar patient safety for both duty-hour groups. However, the conclusions favor the flexible group.
-These data suggest that restricted duty-hours are likewise non-inferior to flexible/extended hours.
Possible changes:
-Allow resident input into schedule design. Some services may benefit from different structures and schedule requirements. In this regard we agree with the ACGME's decision to move to flexible, program-driven shift-length schedules.
-Reasonable duty-hour restrictions that allow programs the flexibility to apply changes in a less onerous way.

Non-inferiority design
-There is little justification for choosing a non-inferiority design when existing data suggest the active "treatment group" is potentially dangerous.
-Work hours were virtually the same between groups, so no real difference in patient outcomes is to be expected.
-The bias was in favor of extended duty-hours. If one wants to argue that extended duty-hours are preferred, with associated physician and patient risks, superiority should be demonstrated.
Possible changes:
-Future studies should compare restricted duty-hours (e.g., 60 hours), as this strikes at the heart of the issue.

Obscure outcome
-30-day patient mortality is not ideal for studying resident duty-hours.
-Intern orders pass through residents, attendings, pharmacists, nurses, and the electronic record, inherently limiting significant mortality-altering errors.
-No resident safety factors were followed (needlestick injuries, well-being, post-call motor vehicle accidents, etc.).
Possible changes:
-Resident safety outcomes such as needlestick injuries, motor vehicle accidents, and well-being/mental health disorders should be tracked.
-Patient experience should be tracked.
-In-hospital errors and near-misses should be tracked.

Lack of concern for resident well-being
-Previous results from iCOMPARE resident surveys demonstrate resident dissatisfaction with the flexible group 19.
-Resident safety outcomes were not tracked.
-Conclusion in favor of the flexible group despite no superiority in results and increased concern for resident harm.
Possible changes:
-4+1 scheduling (1 week of outpatient care with at least 1 full weekend for every 4 weeks of inpatient care).
-Robust resident wellness programs.
-Variation in call length and day length on inpatient rotations. At least one in every 4 days should be a "golden day" with no admits and the ability to leave after rounding to complete work.

Data availability
No data are associated with this article.