Measuring indicators of health system performance for palliative and end-of-life care using health administrative data: a scoping review [version 1; peer review: 2 approved with reservations]

Background: A plethora of performance measurement indicators for palliative and end-of-life care currently exist in the literature. This often leads to confusion, inconsistency and redundancy in efforts by health systems to understand what should be measured and how. The objective of this study was to conduct a scoping review to provide an inventory of performance measurement indicators that can be measured using population-level health administrative data, and to summarize key concepts for measurement proposed in the literature. Methods: A scoping review using MEDLINE and EMBASE, as well as grey literature, was conducted. Articles were included if they described performance or quality indicators of palliative and end-of-life care at the population level using routinely collected administrative data. Details on the indicator such as name, description, numerator, and denominator were charted. Results: A total of 339 indicators were extracted. These indicators were classified into nine health care sectors and one cross-sector category. Extracted indicators emphasized key measurement themes such as health utilization and cost, and excessive, unnecessary, and aggressive care, particularly close to the end-of-life.


Introduction
Growing health care costs in an environment of tight financial constraints and an aging population are challenging many health care systems globally. Recent health care reform initiatives have underscored the importance of providing quality care to patients at all phases of the disease trajectory, including palliative care (PC), both specialist and generalist, and end-of-life care (EOLC), to improve patient and health system outcomes 1,2. The Worldwide Palliative Care Alliance and the World Health Organization recommend provision of PC for all persons with chronic and life-limiting or life-threatening conditions early in the illness trajectory 3. However, universal access to quality, integrated and timely PC for patients remains intermittent at best, with PC largely being provided close to the end-of-life (EOL) 4. Moreover, concerns about high costs associated with health care at the EOL are prevalent in the literature; studies attribute these costs to intensive, aggressive, and sometimes unnecessary utilization of institutionalized care, such as inpatient hospital services in the last few months of life 5.
Monitoring health system performance is one important component of evaluating achievement of quality care for improved patient and health system outcomes. This can be done through health system performance indicators, which summarize directional quantitative information on the quality of health, often evaluated through measurement of structure (inputs) or characteristics of the health care system, process (outputs) or services required to provide health care, and outcomes or measures of ultimate impact of the health care provided 6 . Health system performance indicators allow comparisons across jurisdictions, organizations, and/or administrative databases to track progress over time in efforts to improve health care quality 7 . Quantitative information for indicators can utilize health administrative data or data that is routinely collected in the process of delivering health care programs and services, often used due to its extensive reach (many databases are population-based), low cost, and low respondent burden 8 . Administrative data can also enable contextualization of health system performance indicators due to availability of data on patient sociodemographic and clinical characteristics.
In recent years, jurisdictions in Canada, the United States, and the United Kingdom have focused on defining standards and monitoring health care quality as efforts to expand PC programs intensify 9-14 , and as a result a plethora of health system performance indicators for PC and EOLC using health administrative data have been proposed 13,15-18 . These indicators have been suggested based on differing objectives (e.g. quality improvement, performance measurement) and/or for different patient populations (e.g. cancer, intensive care units, long-term care (nursing home) patients) 13,19-24 . Given the diversity of indicators in the literature, a scoping and cataloguing exercise becomes important in order to consolidate and summarize common measurement concepts (themes) and indicators that can be recommended for use in the performance measurement of PC and EOLC. Such a catalogue can serve as a reference guide for use by policy and decision makers, thereby reducing potential redundant efforts on collecting indicators, and enabling movement to the next step of decision making on what to measure and how best to measure it given each jurisdiction's individual data systems. Furthermore, categorizing indicators by health care sector or setting of health care delivery can be a helpful demarcation that enables measurement of health system performance to align with funding and accountability structures 25 . Consideration of health care setting or health sector (e.g., acute care hospitals, long-term care (LTC), home care) becomes important given each sector's potentially differing care quality aims, target populations, health care processes, and definitions of outcomes. Such organization does not preclude inclusion of indicators of transitional care or coordination of care between health care settings, but rather enables tailoring of indicator definitions that are more reflective of the relevant sector or health setting's contribution to system-level performance.
As such, the primary objective of this study was to conduct a scoping literature review to create a catalogue or inventory of health system performance measurement concepts and indicators for PC and EOLC that use routinely collected, population-level health administrative databases, categorized by health sector.

Methods
This scoping literature review of health system performance indicators for PC and EOLC followed the Arksey and O'Malley (2005) methodological framework for scoping reviews 26 . Scoping reviews are a "type of knowledge synthesis, (that) follow a systematic approach to map evidence on a topic and identify main concepts, theories, sources and knowledge gaps" 27 . The primary objective of this scoping review was to collect and map health system performance indicators for PC and EOLC using routinely collected population-based health administrative data and to organize indicators by health care sector. Since this required consulting a broad array of literature to uniformly collect indicators, rather than appraisal of published indicators, this objective aligned well with the rationale for the use of scoping review methodology in comparison to systematic review methodology, and hence a scoping review methodology was chosen to complete research objectives 28 . A scoping review protocol was not published a priori for this study. Subsequent sections describe specific steps taken for this scoping review in accordance with the Arksey and O'Malley (2005) methodological framework.

Identifying relevant studies for inclusion
We collected both peer-reviewed and grey literature published in English for this scoping review. For peer-reviewed literature, we searched the MEDLINE and EMBASE databases for the period from 2010 to July 2018. A University of Toronto librarian and comparable literature reviews were consulted in the development of the search strategy, which included a combination of the United States National Library of Medicine Medical Subject Headings (MeSH) 29 and keywords (see Extended data: S1 Appendix Search Strategy 30 ). For grey literature, we used the Google search engine with a combination of keywords such as "palliative care", "end-of-life-care", "performance indicators", and/or "quality indicators", and incorporated reference documents made available to us by our knowledge user partners based on their previously conducted literature searches. No jurisdictional restrictions were placed on the grey literature search; however, Canadian references were generally better known by study authors and knowledge users, and formed the bulk of included references.
Inclusion/exclusion criteria for study selection
Inclusion and exclusion criteria were developed in accordance with the study objectives and points of inquiry for the review, notably: a) a focus on core concepts of quality or performance measurement in palliative and/or end-of-life care, b) measurement conducted using population-level, routinely collected data, and c) a health system focus. The full set of generated articles was reviewed based on the developed inclusion and exclusion criteria (see Extended data: S2 Table Study Inclusion and Exclusion Criteria 30 ).
Reviewers (SB and AG) independently screened all articles using titles and abstracts, with any conflicts resolved through discussion and the inclusion and exclusion criteria updated accordingly. This generated a total of 285 articles for full text review (Figure 1). Full text review was also independently conducted by SB and AG, with a total of 54 peer-reviewed studies and 42 grey literature documents being identified for indicator extraction. Additionally, the reference lists of included articles were searched to identify any additional relevant articles, and other relevant studies known to the study authors were added. Studies identified in this way were included even if they were published outside the 2010 to July 2018 search period. This resulted in a total of 32 additional articles being included in the study for data extraction.
Charting and summarizing the data
A data extraction tool was developed to chart details on presented and potential indicators (such as indicator definition, and numerator and denominator if available). Indicators were extracted if they a) measured health system performance for PC and EOLC, and b) used population-level data from health administrative datasets. Once the full set of indicators was charted, SB, AG and PT removed duplicates and conducted health sector classification. We chose to categorize our collected indicators by health sectors specifically for the Canadian province of Ontario, which has a population of over 14.5 million residents 31 and universal health coverage for costs associated with acute care, hospitalizations, physician visits, emergency room visits, long-term care, home care, complex continuing care, and medications for those meeting select age-based and need-based criteria 32. Health administrative data is also collected comprehensively and at the population level for the majority of health sectors with public health coverage, thereby increasing the potential for performance measurement with indicators based on routinely collected administrative data. For the purposes of our study, indicators were classified into the following Ontario health sectors, reflecting how data is currently obtained and organized in Ontario health administrative databases: hospital care (including emergency department (ED)); home care, or care services provided in the home and community 33; LTC (i.e., nursing homes); hospice care; physician services, or care services captured through physician billing codes; medications covered through the public health insurance system 34; complex continuing care (CCC), or technology-based care provided to patients with chronic and complex health conditions 35; cancer care, or care specifically targeted for cancer prevention and treatment; and other.
A category entitled "Cross Sectors" was created to capture indicators transcending more than one health sector (e.g. place of death at various care locations). Further thematic analysis was conducted collectively by the study authors in discussion to group indicators measuring similar constructs by common themes, leading to the creation of measurement themes for indicator classification under each health sector category.
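The charting, de-duplication and classification steps described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Indicator` fields, the `normalize` matching rule and the example records are hypothetical, and the review itself describes duplicates being removed and sectors assigned by the study authors rather than by any automated procedure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    """One charted indicator; field names are illustrative, not the authors' tool."""
    name: str
    numerator: str
    denominator: str
    sector: str
    theme: str


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivially different wordings match."""
    return " ".join(text.lower().split())


def deduplicate(indicators):
    """Keep the first indicator seen for each (name, numerator, denominator) key."""
    seen = set()
    unique = []
    for ind in indicators:
        key = (normalize(ind.name), normalize(ind.numerator), normalize(ind.denominator))
        if key not in seen:
            seen.add(key)
            unique.append(ind)
    return unique


def group_by_sector_and_theme(indicators):
    """Return a nested mapping {sector: {theme: [indicator names]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for ind in indicators:
        grouped[ind.sector][ind.theme].append(ind.name)
    return {sector: dict(themes) for sector, themes in grouped.items()}


# Made-up records for illustration only; not taken from the review's extracted data.
charted = [
    Indicator("ED visits in last 30 days of life",
              "decedents with >=1 ED visit in last 30 days", "all decedents",
              "Hospital care", "Utilization"),
    Indicator("ED  visits in last 30 days of life",
              "Decedents with >=1 ED visit in last 30 days", "All decedents",
              "Hospital care", "Utilization"),
    Indicator("Place of death",
              "decedents who died at home", "all decedents",
              "Cross Sectors", "Place of care/death"),
]

unique = deduplicate(charted)
catalogue = group_by_sector_and_theme(unique)
```

An exact-match key like this only catches trivially repeated entries; indicators that measure the same construct with different specifications would still need the kind of manual review and discussion described in the text.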

Consultation exercise
Following this initial categorization, a working group with subject matter experts (both researchers and clinicians) was organized to review the final list of indicators. Indicators were cross-checked to ensure a) all relevant concepts related to PC and EOLC had been captured, and b) all indicators were conceivably measurable using (Ontario/Canadian) health administrative data based on the knowledge and expertise of the subject matter experts present.

Studies reviewed
A total of 1111 indicators were extracted from 128 articles. 722 indicators were excluded (due to duplication or irrelevance), resulting in a total of 339 indicators included for summarization (Figure 1). Publication dates of the studies and grey literature ranged from 1992 to 2018, with only 5 included studies published prior to 2003, and the majority (n=82) published after 2008 (Table 1).
The 42 grey literature documents included publications from Canadian provinces (e.g., Ontario, Alberta, Saskatchewan) (n=31), where routinely collected administrative data is readily available, and from the United States (n=6), United Kingdom (n=4), and Australia (n=1). They included documents generated by PC delivery organizations, such as local home care agencies, and also jurisdictional collaborative efforts to improve care.

Categorization of indicators
Ten health care sectors were utilized to categorize indicators, with sector categories of Cross Sector, Cancer and Home Care having the greatest number of indicators (Table 2). Table 3 presents a summary of indicators and key measurement themes by health care sector, and a collective list of all indicators collected can be found in Extended data: S1 Table Detailed List and Information on Collected Indicators 30 . These findings are discussed next.

Cross sector
Within this review, a total of 58 collected indicators were categorized as transcending sectors or cross sector 9,13-18,20-24,36-106, and four measurement themes were identified. Indicators classified under the cross sector category tended to focus on how well palliative and EOL patients were being cared for, through examination of aggressiveness of care, place of care/death, availability of palliative care services, and overall cost. This measurement aimed to better understand how the overall health care system was performing in the care of palliative and EOL patients, rather than the performance of any specific health care sector by itself (e.g. the enrollment of patients in a palliative care program and how early this enrollment occurred).

Medications (n=26)
Cost: Drug costs for the patient, and costs incurred by the government, per prescription 9,18,40,46.
Symptom Management: Whether symptoms were managed with the most appropriate medications (including for specific disease types), such as a short- or long-acting opiate within 60 days before death. Medications evaluated included opioids, narcotics, antipsychotics and anxiolytics. Overmedication, number of prescriptions, and aggressive treatment, such as the prescribing of antibiotics at the EOL, were also included.

Complex continuing care (CCC)
A total of three indicators were measured within the CCC sector, which typically focuses on rehabilitation and/or PC. They focused on pain management 108, access to PC 58, and cost 68. With respect to performance measurement, this sector was the most underdeveloped in the literature compared to other sectors, likely due to its uniqueness to the Ontario context.

Other
Three indicators did not fit the sectors above. They were ambulance use by EOL patients 56,87 , medical lab services and equipment expenses 144 , and diagnostic testing at the end-of-life 133 . These indicators were classified together under the "Other" sector category.

Discussion
The purpose of our scoping review was to collect, organize, and share a distilled inventory of sector-specific health system performance measurement indicators for PC and EOLC. Our scoping review revealed 339 indicators, organized across nine distinct and one cross-cutting health care sector categories. Indicators within each sector category were subgrouped by key measurement themes. Collectively, these indicators represent the field's thinking on how best to measure high health system performance in the delivery of both PC and EOLC.
One of the most commonly occurring measurement themes across sectors focused on health system utilization (e.g., hospital length of stay, ICU admissions, chemotherapy or home care received). Indicators of utilization were collected to compare palliative/end-of-life and non-palliative/non-end-of-life populations (to represent access), to examine the cost of delivered care, and to understand how well the system was performing in the delivery of care (e.g. on wait times). However, the emphasis on utilization through measurement of process indicators, such as in the hospital sector, highlighted that measurement themes such as patient experience and effectiveness of palliative care were not captured. Measurement of such themes through indicators can serve as an indication of the achievement of positive patient and health system outcomes.
Another frequent measurement theme across sector categories was the aggressiveness of care at the EOL for palliative and EOL populations. This theme included indicators of inappropriate treatment, medications, and transitions at the end-of-life that would be avoided in a well-performing health system focused on quality patient care. As health system resources become increasingly constrained, this measurement theme will likely gain prominence, not only in helping to improve patient outcomes, but also in reducing costs from unnecessary and aggressive health services.
Despite the wealth of indicators collected, the measurement of patient-centered or patient-reported outcomes was infrequent at the population level using health administrative databases. Measurement of themes related to unmet care and self-management support needs, advance care planning, goals of care, consideration of patient preferences, and patient and caregiver burden was limited. However, the absence of indicators from the review does not mean that measurement is not occurring, but rather that it may not be occurring in population-level health administrative databases. Given the importance of these indicators, measurement through primary data collection may be required to obtain a comprehensive picture of health system performance.
Additionally, the literature review revealed that although there were a large number of indicators, many of these indicators measured similar constructs, but with different specifications. This includes variations in the time period of measurement and the patient population. There were also differences in how debated concepts in the literature, such as what constitutes markers of quality care, were operationalized (for example, how to define a burdensome transition). S1 Table (Extended data 30 ) includes each of these collected indicator iterations across various data sources in more detail. The multiple variations of indicators, with little justification of why they were chosen, posed challenges in summarizing the data and in recommending how these indicators should be measured. This difficulty was compounded by the absence of information on how indicators were operationally defined, including a lack of information on the numerators and denominators in many of the extracted studies. As such, successive iterations of similar indicator themes resulted in issues of comparability across jurisdictions. While noting differences in data availability, efforts are needed to systematically define a set of standardized indicators for use across jurisdictions. There have been efforts by some stakeholders to implement accepted performance measurement frameworks 98,151. Subsequent efforts can then be aimed at improving the construction and measurement of these indicators, and at continuing to evaluate system performance.
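To illustrate why differing specifications limit comparability, the sketch below computes one hypothetical construct, the proportion of decedents with at least one event (for instance, an ED visit) near the end of life, under different look-back windows. The function name, the record layout and the data are assumptions made for illustration; they are synthetic and do not come from any indicator or dataset in the review.

```python
from datetime import date, timedelta


def proportion_with_event(decedents, window_days):
    """Share of decedents with at least one event in the last `window_days` of life.

    Numerator: decedents with >=1 event date in [death - window_days, death].
    Denominator: all decedents supplied.
    """
    if not decedents:
        raise ValueError("denominator is empty")
    numerator = 0
    for person in decedents:
        window_start = person["death_date"] - timedelta(days=window_days)
        if any(window_start <= d <= person["death_date"] for d in person["event_dates"]):
            numerator += 1
    return numerator / len(decedents)


# Synthetic decedent records (hypothetical layout: a death date plus event dates).
decedents = [
    {"death_date": date(2018, 6, 30), "event_dates": [date(2018, 6, 25)]},  # 5 days before death
    {"death_date": date(2018, 6, 30), "event_dates": [date(2018, 6, 10)]},  # 20 days before death
    {"death_date": date(2018, 6, 30), "event_dates": [date(2018, 4, 1)]},   # 90 days before death
    {"death_date": date(2018, 6, 30), "event_dates": []},                   # no events
]

last_14 = proportion_with_event(decedents, 14)  # 1 of 4 decedents
last_30 = proportion_with_event(decedents, 30)  # 2 of 4 decedents
```

With identical underlying data, the 14-day and 30-day versions of the indicator give 0.25 and 0.5 respectively, so values reported by jurisdictions using different windows or denominators cannot be compared directly.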
Limitations specific to the methods and results of this scoping review exist. Indicators were collected based on their ability to be extracted from a population-based data source, as judged by the study authors and subject matter experts, which may have introduced some bias (e.g. a large focus on Canadian grey literature). As one of the reasons for this scoping review was to better understand which health system performance indicators currently exist for local efforts of policy planning, a bias towards indicators that can be readily collected in the Canadian and/or Ontario context may exist. Moreover, quality appraisal of indicators was not conducted. Next, the conflation of the terms, palliative,
Overall, this study reviews, by health care sector, population-level health system performance indicators for PC and EOLC that can be measured through administrative databases. Although a large number of indicators have been reported for each sector, these indicators are often variations on the same theme, reflecting a lack of consensus on key debated concepts within the PC and EOLC literature. Future work is needed to achieve consensus 'best' definitions of these indicators as well as a universal performance measurement framework, similar to other ongoing efforts in population health 121. Our scoping review can reduce duplication of the extensive work required when a jurisdiction wishes to make a concerted effort to improve care in palliative and EOL populations through adoption of a performance measurement framework; one of the first steps in such efforts is typically to collect performance indicators from the literature. This review can directly inform indicator selection and development for other local, national and international efforts currently underway to improve PC and EOLC.
Such performance measurement through indicators can help identify gaps in the access and quality of care and evaluate the impact of PC interventions that aim to bridge these gaps.

Data availability
Underlying data
All data underlying the results are available as part of the article and no additional source data are required.

Veerle Piette
End-of-Life Care Research Group, Vrije Universiteit Brussel (VUB), Brussels, Belgium
This study provides a useful and welcome overview of EOLC quality indicators that have been developed or used in published studies.
Overall the presentation is clear, but we have a number of comments and suggestions the authors may wish to address.
1. Rationale for the Canadian perspective versus the conclusions about comparability across jurisdictions: the emphasis on Canada is strongly integrated in the article, both in the choice of the health sectors used to classify the indicators and in the analysis criterion "conceivably measurable using (Ontario/Canadian) health administrative data". That seems to be somewhat in conflict with the conclusions and the plea the authors seem to make about comparability across jurisdictions.
2. We endorse the call for cross-national comparability. However, we would welcome a bit more reflection on the plea for more comparability across jurisdictions. There are likely several reasons for variations, including data availability but also different discourses around quality care. It is not so surprising then that experts within one jurisdiction face-validate indicators in a different manner than those in another. What would be needed to develop these? What rigorous efforts, using what methods, would the authors suggest for such a process? Our own experiences in cross-national comparisons have taught us that issues around measurement equivalence, but also conceptual equivalence, are a huge challenge.
3. Aims: the aim includes measurement concepts AND indicators, whereas the results talk about indicators. This creates confusion about what a 'measurement concept' is and how it differs from the indicators. Either it needs some operationalization or, alternatively, you could remove it from the aim.
4. Methods: literature search: the search strategy and inclusion/exclusion criteria were well done, are provided in an online format and were described precisely. It is easy to access the files through the citation to the data repository. However, we have three comments. First, why is the search limited to July 2018? As this is a quickly expanding field, this seems like an important limitation. What is the risk of missing important recent development efforts? Second, no validation strategy for the search string was followed, and this could be mentioned as a limitation. Third, the inclusion criteria for study selection are OK, but there is no mention of the criteria for indicator selection. The authors remove 722 indicators for irrelevancy, but based on what criteria? Reasons for removal would also be useful in the flowchart.
5. Results: Quality assessment: the authors did not do a formal quality assessment of the studies being reviewed. Knowing the heterogeneity of the included studies/literature, we can understand the practical difficulties in using existing quality assessment tools. However, as readers we would benefit from at least some reflection or discussion about the methodological quality and rigor in indicator selection (e.g. convenience selection vs. formal validation efforts? What methods are used?). It would also be very helpful to have an overview of the level of scientific evidence underlying an indicator. It is likely that for most indicators the evidence level is expert opinion and that only a limited number are based on evidence of causal impact on quality of life or related concepts. As such an overview would probably demand a huge undertaking, it is perhaps something that the authors may wish to stipulate as an attention point for future research?
6. Discussion: There is quite a strong emphasis on indicators being sector-specific. Not sure if this is accurate, as the idea is not to use these indicators within a certain sector. We believe that for many of the indicators there is a system-wide responsibility to assess and improve quality. For instance, referral to home care is the responsibility of not only the actors within home care but also those outside that sector. Careful not to suggest an even stronger echeloning of health care.
7. Discussion: The authors focus on the fact that the measurement of patient-centered or patient-reported outcomes was infrequent at the population level using health administrative databases. This is an important point, but we think it may be important to provide readers with more discussion about why this is so (not routinely collected?) and what would be needed to move forward to allow for inclusion of such indicators.
8. Discussion: strengths and limitations: it would be good to have a separate paragraph discussing these. Also, some of the limitations mentioned in the points above may need to be included.
A minor point about the classifications: it seems that you approached the classification of indicators both deductively (i.e. using Ontario's health sector classification) and inductively (i.e. based on the types of indicators you found). The inductive approach is a result of your review (and should also be presented as such), and we would welcome more insight into how you came to the classification.
"One of the most commonly occurring measurement themes across sectors focused on ○