The application of multi-criteria decision analysis to inform in resource allocation [version 1; peer review: 1 approved, 1 approved with reservations]

Background: There is a perception held by payers that orphan products are expensive. As a result, the current health technology assessment systems might be too restrictive for orphan drugs, potentially denying patients access to life-saving medicines. While price is important, it should be considered in relation to a broader range of disease-related product attributes that are not necessarily considered by many health technology assessment agencies. To overcome these challenges, multi-criteria decision analysis (MCDA) has been proposed as an alternative method for evaluating technologies.
Methods: A targeted literature review was conducted to identify the most frequently cited attributes in MCDA in rare diseases. From the leading attributes identified, we developed an MCDA framework with which to aggregate the orphan drug values. We subsequently reviewed and plotted the relationship between single attributes and the average annual treatment costs for 8 drugs used in the treatment of rare endocrine diseases. The annual treatment costs were based on UK list prices for the average daily dose per patient.
Results: The five most frequently mentioned attributes in the literature were as follows: Disease severity, Unmet need (or availability of therapeutic alternatives), Comparative effectiveness or efficacy, Quality of evidence and Safety & tolerability. Results from the MCDA framework indicate a wide range of average annual per-patient costs for drugs intended for the same diseases, and likewise for diseases with a similar level of Disease severity.


Introduction
The increasing demand on healthcare resources has created the need to minimize costs, resulting in more rigorous pricing and reimbursement pathways in most of Europe and potentially delaying patients' access to valuable treatments 1 . As the pressure on health technology assessment (HTA) bodies increases, so does the need to demonstrate the cost-effectiveness of orphan drugs. There is a perception among payers that orphan drugs are over-priced and therefore not cost-effective 2 . Patients have been refused access to drugs based on cost-effectiveness perceptions that might be linked to the application of cost-effectiveness thresholds 3 . The pharmaceutical media go to great lengths to highlight that orphan drugs are expensive interventions with costs being raised by the involvement of "big pharma" 4,5 . Therefore, orphan drugs are increasingly under scrutiny by health authorities as part of cost-containment initiatives and are frequently only reimbursed when negotiated under a patient access scheme 6,7 . The HTA processes are aimed at bridging the gap between evidence and healthcare policy and reimbursement decisions 8 . The current HTA systems adopted in many European countries may facilitate financial decisions that largely disregard disease impact on patients and a host of other disease-related attributes, and potentially ignore some unique features of innovative orphan drugs, thereby delaying or denying patients access to much needed treatments 9-11 . While the cost and budget impact of orphan drugs are important in relation to affordability, cost should be considered in relation to a broader range of drug- and disease-related attributes that are not necessarily considered in the usual HTA processes for most non-orphan drugs.
This is increasingly problematic for markets linking cost-effectiveness analysis to reimbursement decisions, as the cost per quality-adjusted life-year (QALY) approach is not necessarily sensitive enough to capture the broader attributes of the therapy such as unmet need, disease severity, patient and societal preferences and other disease-related elements, possibly including disease rarity 12 . Disease rarity implies that the level of clinical evidence and the uncertainty surrounding the treatment effectiveness generated in clinical trials is likely to differ from that of conventional diseases due to several key factors 13 . In particular, small heterogeneous populations increase the level of uncertainty of the outcomes 14,15 . In some rare diseases, there is limited natural history data and often a lack of consensus on the choice of treatment comparator and clinical endpoints 16,17 . Ethical constraints regarding placebo treatments in the control group when no other treatment options are available may arise 15,18 . Consequently, these factors might preclude a thorough analysis. Therefore, the likelihood of orphan drugs achieving the expected robust cost per QALY levels is limited 19 . Thus, the cost per QALY approach might not be optimal to assess the real value and benefits of orphan drugs and should, at least, not be the dominant tool to establish reimbursement.
In response to criticism regarding the variation in reimbursement decisions and potentially inconsistent patient access to orphan drugs 20-22 , the use of multi-criteria decision analysis (MCDA) has been proposed as a viable method for orphan drug assessments. The MCDA framework provides a consistent, transparent and accountable approach for the decision-making process for orphan drugs. The MCDA approach assumes that the value framework adopts clear criteria and that the criteria are weighted appropriately 20,23 . The premise is that because the framework is based on multiple criteria, it will provide a result that reflects not only the efficacy and cost of the drug, but also wider aspects such as disease severity, unmet need, patients' health-related quality of life and target patient age groups. Techniques for MCDA enable expert panels to perform "trade-offs" between multiple aspects and outcomes of a product against the product's cost as well as a combination of different criteria 24 . The Evidence and Value: Impact on Decision Making MCDA framework was developed to support the prioritization of a broad range of healthcare interventions, with priority given to the intervention that obtains the highest rank. It has evolved and is now in its 10th edition 25 , yet it is not designed specifically for assessing the value of orphan drugs. MCDA frameworks could also inform on societal preferences, if designed to capture such information. The International Society for Pharmacoeconomics and Outcomes Research Task Force on multi-criteria decision analysis published recommendations on MCDA, highlighting that MCDA is a tool to support decision-making by HTA bodies and payers 26,27 .
Based on our earlier research 28 , the aim of this study was to further review the literature to establish which criteria are most frequently discussed in the context of assessing orphan drugs, to re-test these criteria on a group of drugs used in the treatment of rare endocrine disorders, and to analyze the possible association between the key criteria and the average annual cost of each treatment. All the endocrine diseases included in this study are rare (maximum prevalence of 50/100,000 population in Europe 29 ), and the drugs included in the study have orphan drug designation, although some have lost marketing exclusivity.

Targeted literature review
This study entailed two parts. The first was a targeted literature review of PubMed, Google Scholar and Google to identify any publications featuring MCDAs specifically in orphan drugs and rare diseases from January 1983 to December 2018. Search terms included "multi-criteria decision analysis", "MCDA", "orphan drug", "health technology assessment", and "quality-adjusted life-year". This time frame was chosen to capture as wide a range of studies as possible and to reflect the start of the Orphan Drug Act in the USA 30 . The search strategy was restricted to English-language studies and only those reporting on medicines in human use. Each shortlisted study was reviewed to establish what criteria had been used or mentioned in the studies. Criteria were listed by the frequency with which they were reported.
The second part of the study aimed at testing the eight most reported criteria identified in the literature review by developing an MCDA framework in which to compare the average annual costs of the eight endocrine drugs included in the study with their aggregated MCDA-framework score. The drug cost calculations were based on the average dose according to the Summary of Product Characteristics, taking into account the different body weights for adult and pediatric patient cohorts. Prices were obtained from the British National Formulary (March 2019) and published price lists (March 2019) for the UK and converted to euros at the December 2018 exchange rate 31 . A further targeted literature review was conducted to source the disease- and drug-related data needed to populate the MCDA framework for each drug and each criterion. Only studies that met the Oxford Centre for Evidence-based Medicine 32 criteria levels of 1a to 3a were included; individual case-control studies, case series and expert opinion without explicit critical appraisal were therefore excluded. A hand-search of several health technology assessment bodies (National Institute for Health and Care Excellence, Scottish Medicines Consortium, Zorginstituut Nederland, Haute Autorité de Santé) was also performed. The data for each drug and each disease were extracted and categorized in tables by criterion.
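The cost calculation described in this section (average daily dose, UK list price, conversion to euros) can be sketched as follows. This is a minimal illustration only, assuming a weight-based daily dose and a list price expressed per milligram; the class name, function name, dose, price, body weights and exchange rate are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Regimen:
    """Hypothetical weight-based dosing regimen (not study data)."""
    dose_mg_per_kg_per_day: float  # assumed average daily dose per kg
    price_gbp_per_mg: float        # assumed list price per milligram


def annual_cost_eur(regimen: Regimen, body_weight_kg: float,
                    gbp_to_eur: float, days_per_year: int = 365) -> float:
    """Average annual per-patient cost in euros for one patient cohort."""
    if body_weight_kg <= 0 or gbp_to_eur <= 0:
        raise ValueError("body weight and exchange rate must be positive")
    daily_dose_mg = regimen.dose_mg_per_kg_per_day * body_weight_kg
    annual_cost_gbp = daily_dose_mg * regimen.price_gbp_per_mg * days_per_year
    return annual_cost_gbp * gbp_to_eur


# Illustrative values only: dose, price, weights and exchange rate are assumptions.
example = Regimen(dose_mg_per_kg_per_day=0.05, price_gbp_per_mg=2.0)
adult_cost = annual_cost_eur(example, body_weight_kg=70.0, gbp_to_eur=1.10)
child_cost = annual_cost_eur(example, body_weight_kg=30.0, gbp_to_eur=1.10)
print(f"adult: EUR {adult_cost:,.2f}; pediatric: EUR {child_cost:,.2f}")
```

Computing the adult and pediatric cohorts separately, as here, makes the effect of body weight on the annual cost explicit; how the two cohorts were combined into a single average in the study is not specified in this section.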

Development of the MCDA framework
In developing the MCDA framework, we had planned to use the ten most cited criteria. However, on reflection, we reduced this to six, on the basis that Disease Rarity (prevalence) is a given when developing assessment frameworks for orphan drugs, as is the Size of the Population. The Quality of the Evidence and Level of Research Undertaken were considered to report the same data and were therefore reported as Level of Research Undertaken. Uncertainty of Effectiveness was deemed to reflect efficacy (or lack thereof) and was not used in this framework. A numerical scoring system was developed for each criterion, as shown in Table 1. For Disease Severity, a higher score denotes a worse impact of disease on the patient. For the treatment-related criteria, the higher the score, the more marginal the benefit of the treatment is deemed to be. Overall, for the average score of each drug, the lower the score, the better the treatment is. Each criterion (and sub-criterion for Disease Severity) was scored by two people working together, and then checked by two further people to ensure that the same approach was used for all the criteria. Disease Severity comprises four sub-criteria: Life-threatening or life-shortening, Severity of disease symptoms, Mental status due to the disease (anxiety, depression) and Impact of the disease on physical ability. Each sub-criterion was scored from 1 to 4 (Table 1) and an average total score was calculated for Disease Severity. The data for the sub-criteria were based on studies and publications of the included diseases and their overall impact on patients, as well as treatment guidelines to establish how treatment is expected to manage the disease and its symptoms 33-63 . The Level of Research Undertaken was scored on an adaptation of the Levels of Evidence from the Oxford Centre for Evidence-based Medicine 64 . Efficacy was scored according to the degree to which the primary endpoint was met in the clinical trials (Table 1).
The scores for Safety were ranked according to the level of serious adverse events reported in the clinical trials. Unmet Need was defined by the number of other available treatments for each of the endocrine diseases. Similarly, each drug was given a score to represent how many indications the drug was licensed for in total. Criteria scores were compared with the average annual cost of the respective drug, based on list prices published in the British National Formulary for 2019.
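The scoring scheme described above can be sketched as follows. This is an illustration under stated assumptions rather than the scoring tool used in the study: the identifiers, the example scores and the use of a plain unweighted mean across the six criteria are assumptions (the text states only that an average score was calculated per drug and that lower is better), and the actual level definitions are those of Table 1, which is not reproduced here.

```python
from statistics import mean
from typing import Mapping

# The four Disease Severity sub-criteria named in the text (identifiers are ours).
SEVERITY_SUB_CRITERIA = (
    "life_threatening_or_life_shortening",
    "severity_of_symptoms",
    "mental_status",
    "impact_on_physical_ability",
)


def disease_severity(sub_scores: Mapping[str, int]) -> float:
    """Average of the four sub-criteria, each scored from 1 to 4."""
    values = []
    for name in SEVERITY_SUB_CRITERIA:
        score = sub_scores[name]
        if not 1 <= score <= 4:
            raise ValueError(f"{name} must be scored 1-4, got {score}")
        values.append(score)
    return mean(values)


def overall_score(criterion_scores: Mapping[str, float]) -> float:
    """Unweighted mean across criteria (assumed aggregation; lower is better)."""
    return mean(list(criterion_scores.values()))


# Hypothetical drug: every score below is invented for illustration.
severity = disease_severity({
    "life_threatening_or_life_shortening": 3,
    "severity_of_symptoms": 4,
    "mental_status": 2,
    "impact_on_physical_ability": 3,
})
hypothetical_drug = {
    "disease_severity": severity,
    "level_of_research": 2,
    "efficacy": 1,
    "safety": 2,
    "unmet_need": 3,
    "licensed_indications": 1,
}
print(severity, overall_score(hypothetical_drug))
```

Keeping the sub-criterion average separate from the overall aggregation mirrors the two-step description in the text and makes it easy to swap in a different aggregation rule later.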

Identification of articles and criteria
Even though this was a targeted literature review, we followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses method 65 . The literature review identified 161 publications (after removal of duplicates), of which 15 were relevant to our study 21,23,25,66-77 (Figure 1). A bibliometric analysis classified the publications by the background of the first author (e.g. academic, industry), as shown in Figure 2. First authors were from academia in 53% of the publications 23,66-68,70,73,75,77 . Consultancy backgrounds 72,74 , government or parastatal origins 25,76 , and healthcare positions 69,71 each accounted for 13% of first authors. The remaining publication (7%) was first-authored from industry 21 .
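As a quick arithmetic check, the reported shares can be reproduced from publication counts. The counts below are inferred by counting the citation numbers listed for each first-author category in the paragraph above; they are our reading of the text, not figures reported separately by the study.

```python
# Counts inferred from the citation numbers given per category (an assumption).
first_author_counts = {
    "academia": 8,
    "consultancy": 2,
    "government or parastatal": 2,
    "healthcare": 2,
    "industry": 1,
}

total = sum(first_author_counts.values())
shares = {
    category: round(100 * count / total)
    for category, count in first_author_counts.items()
}

for category, share in shares.items():
    print(f"{category}: {first_author_counts[category]}/{total} = {share}%")
```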
A total of 45 criteria were identified (Table 2). Geographic variation in the importance of the different criteria was observed. For example, in publications from the Netherlands and Germany, the most important criteria in MCDA for orphan drugs were Life-threatening nature of disease and Evidence of clinical efficacy and patient outcome, respectively 68 . As anticipated when focusing on rare diseases, the most frequently cited criterion was Disease severity. This is inevitable, given that many of the rare diseases are life-threatening and have a significant impact on quality of life 78 . The same applies to Unmet Need. By contrast, and rather surprisingly, Treatment convenience was reported on by only 7% of the studies, despite the fact that some drugs may need to be administered regularly by non-oral routes (e.g. by injection or intravenous infusion), which may affect overall treatment costs and quality of life. Although Innovation (of new treatments) is increasingly of interest to HTA bodies, this criterion was reported on by only 20% of the studies.

MCDA framework: Criteria per average annual cost per patient
The average annual per-patient cost in relation to the total drug score is shown in Figure 3, and the comparison of Disease severity score to the average annual cost per patient (Figure 4) highlights the cost differences for diseases with the same severity.
Unmet Need is a criterion that featured frequently in the literature, as would be anticipated in relation to treatments for rare diseases, yet a plot of Unmet Need against the average annual cost suggests there is no discernible relationship based on this single criterion. For example, the three products used to treat acromegaly all address the same unmet need, yet considerable variation in average annual cost is observed.
The analysis also reviewed the relationship between average annual cost and the number of licensed indications of each product. Based on this small sample of products, the results do not suggest that the average annual cost decreases as a product is licensed for additional indications.
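The study assessed these relationships by plotting each criterion against cost. For readers who want a numerical summary alongside such plots, one option (not used in the study, and offered here only as a sketch) is a Spearman rank correlation, which suits ordinal criterion scores. The function names and the example data are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical criterion scores and annual costs (EUR); not study data.
unmet_need_scores = [1, 2, 2, 3, 1, 3, 2, 1]
annual_costs_eur = [5000, 900, 21000, 12000, 700, 3000, 15000, 9000]
print(f"Spearman rho = {spearman(unmet_need_scores, annual_costs_eur):.2f}")
```

With only eight drugs, any such coefficient would carry wide uncertainty and should be read as descriptive rather than inferential.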

Discussion
This study sought to assess the key criteria used in MCDA frameworks for orphan drugs and, having given each criterion a score based on an ordinal scoring system, to compare the aggregated outcomes for each drug with its average annual per-patient cost without applying different weights (importance) to any criteria. We used the criteria that appeared most often in the literature, although, because all the diseases studied are rare diseases, the criteria Disease rarity (Prevalence), which was cited by 47% of the studies, and Size of the affected population (40%) were not included in this MCDA framework. We observed that there is room for the inclusion of other criteria. Notably, Treatment convenience was not included in the criteria for orphan drugs. In the context of some complex orphan drug treatments that may be used for the duration of life, such as in the lysosomal storage disorders, the cost and management implications of a drug that has to be administered intravenously on a very regular basis are greater than those of a drug injected subcutaneously once a month (e.g. treatment for acromegaly) or of a drug taken orally. Therefore, we believe that it is a criterion worth including in MCDA frameworks for orphan drugs.
The Target age (of the treatment population) is a criterion that featured in only two of the studies we reviewed. It could nevertheless affect the assessment of a drug insofar as illness may result in absenteeism from school for children and from work for parents caring for ill children. Access to orphan drug treatment could therefore have a significant impact on educational attainment and productivity.
The Level of research undertaken prior to licensing of a drug is likely to establish the drug's efficacy and safety with a view to reducing the level of uncertainty. However, we question the comparability of the level of research between products. To our knowledge, no studies have suggested a methodology to score the duration of studies versus the population size of a study. For example, would a study performed in a small population sample over several years earn a higher score than one conducted on several thousand patients over less than a year? In the scope of drugs used for rare diseases, the population size of a study is frequently very small due to the low prevalence of the disease. In diseases where there is no other treatment option, it may be necessary to reduce the impact or weight of this criterion in order to facilitate access to the medication. In this study, the level of research showed little correlation with the average annual per-patient cost of each drug.
While some other studies have shown an inverse relationship between Disease rarity and price 79,80 , this study was not designed to analyze this relationship. However, it provided the opportunity for a within-therapy comparison of treatments for acromegaly. One might expect treatments for the same disease to have a narrow average annual per-patient cost range. Nevertheless, the average annual prices range from €617 to €20,520, demonstrating that although disease severity is the same for two or more drugs for the same indication, their costs may vary widely.
The criteria can be measured on various scales: binary, nominal, ordinal, cardinal or ratio 81 . For simplicity and transparency, we chose a numerical scoring system whereby the data for each criterion in the framework are converted into consistent numerical values from which an overall score is calculated. In their pilot study to test an MCDA for orphan drugs, Sussex et al. 74 used a numerical rating scale from 1 (worst score) to 7 (best score). Their rationale was that it permitted sufficient discrimination between levels for each criterion. By contrast, Hughes-Wilson et al. 21 suggested a simple three-level scale. One of the limitations of the model might be the application of the scores to the criteria, and the subjectivity of those scores. However, as each criterion was treated equally and scored by the same team, the effect of this subjectivity on the overall results is likely to be minimal.
The literature review highlights some of the limitations of MCDAs. Defining the criteria at the outset is crucial to ensure that overlap between criteria is avoided. In this study, we chose to avoid confounding the overall score by excluding Disease rarity and Size of the population. It is essential that the criteria are not selected merely to favor a preferred outcome. Ultimately, disease rarity in itself should not be the reason for paying premium prices. Other factors, such as severity of the disease, may be more important 82 . Furthermore, it is unlikely that all criteria should carry the same importance (weight) 23 , and as yet, very little work has been done to elucidate weighting preferences in a way that could best inform healthcare reimbursement decisions. Weighting the criteria may be complicated, and dependent on the perspective of the assessment 21 .
Given that the criteria used might vary from one health economy to another, and that the importance of these criteria may likewise be perceived differently, the MCDA framework will not necessarily lead to the same reimbursement decisions across different countries. Choices of criteria and weighting might be influenced by the health policy focus and the inclusion of the societal perspective 83 .
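To illustrate why weighting and perspective matter, the sketch below applies a simple weighted average to the same set of criterion scores under different weight sets. No weights were applied in this study; the weighted-sum form, the function name, the scores and both sets of weights are hypothetical and serve only to show how the overall score can shift with the chosen perspective.

```python
from typing import Mapping


def weighted_score(scores: Mapping[str, float],
                   weights: Mapping[str, float]) -> float:
    """Weighted average of criterion scores; weights are normalized to sum to 1."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("at least one weight must be positive")
    return sum(scores[c] * weights[c] for c in scores) / total_weight


# Hypothetical scores for one drug (lower is better, as in this framework).
scores = {"disease_severity": 3.0, "efficacy": 1.0, "safety": 2.0}

# Hypothetical weight sets representing different perspectives.
equal_weights = {"disease_severity": 1.0, "efficacy": 1.0, "safety": 1.0}
efficacy_focused = {"disease_severity": 1.0, "efficacy": 2.0, "safety": 1.0}
severity_focused = {"disease_severity": 2.0, "efficacy": 1.0, "safety": 1.0}

for label, w in [("equal", equal_weights),
                 ("efficacy-focused", efficacy_focused),
                 ("severity-focused", severity_focused)]:
    print(f"{label}: {weighted_score(scores, w):.2f}")
```

With equal weights the result reduces to the plain unweighted mean, so an unweighted framework is a special case of this form.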
Since regulatory organizations, such as the Haute Autorité de Santé 84 in France, have indicated that drug innovation is valued in assessing drugs, it is interesting to observe that Level of innovation did not feature more frequently in the literature. It should serve as a powerful criterion in the judgement of drugs, especially in the current climate of priority setting, not only for orphan drugs but equally for all other drugs too.
In their work on MCDAs, Hughes-Wilson et al. 21 highlighted the need for several key criteria that should form part of an MCDA tool for orphan drugs. One such criterion is Manufacturing complexity. Our study found this criterion to be reported in 27% of the reviewed publications. In the landscape of plasma-derived medicines (clotting factors, immunoglobulins), which are reputed to be more complex in their manufacturing processes 85,86 , it is likely that this criterion will feature more prominently if MCDAs are to become useful tools in drug assessments.
The scoring systems adopted for each of the criteria in this study were numerical, with equal importance between grades. Despite potential criticism about the imprecision of a simple numerical scoring system, the rationale for its use lies in its simplicity and in the fact that it does not require an expert panel to adjudicate the value of one criterion against another, as would be the case in outranking methods 87 , satisficing methods 20 and value measurement methods 20 . Furthermore, it does not require the use of special computer software. The same rationale applies to the absence of any weighting or preference for a particular criterion, since the primary aim was to apply the key identified criteria as a tool for within-therapy and across-therapy comparisons. Since the methodology was applied uniformly across the criteria and drugs in the study, the outcomes provide valuable tools for comparison. However, the aim of the study was to demonstrate the trends and, above all, to highlight the need for a different approach to assessing orphan drugs.

Conclusion
To date, much work has been done to describe MCDAs and the criteria that could be considered core to value, despite the evident lack of relationship between Disease severity and the overall drug score derived in the MCDA framework. Whilst no consensus on their design and applicability has yet been reached, MCDAs are frameworks that are worth adopting in the assessment of orphan drugs, specifically because they include disease-related and drug-related criteria that are likely to impact on patients and the healthcare system. They are not intended to replace the current HTA methodologies, but should certainly be used in conjunction with them to assist in decision-making processes and in prioritizing the allocation of healthcare resources.

Data availability
All data underlying the results are available as part of the article and no additional source data are required.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Yes
MCDA scoring/frameworks and flows directly into measurement and weighting of the criteria. On measurement, instead of simply listing this as a limitation, can you provide any suggestions to future researchers on how to define the levels within criteria a priori instead of depending on past literature? Is it worth identifying criteria by engaging with patients, providers, etc. on criteria that may not be driven by regulatory approval? In other words, is there value in developing (and eventually weighting) criteria using ex ante preferences? This could aid in evidence generation, societal preference alignment, etc. Maybe that was the point you were making, but it was not clear to me.
2) Related to 1), importance/weighting is quite complicated and should have its own dedicated paragraph. Not only does an overall MCDA depend on perspective, as you note in the Discussion, but certain criteria may also depend on perspective, i.e., clinical benefits are largely accrued by patients but cost is accrued by health systems. So it may be wise to dig in a bit more here with a separate paragraph noting these challenges. Weighting itself also comes with multiple methods that vary in bias.

Are sufficient details of methods and analysis provided to allow replication by others? Yes