A Discrete-Choice Experiment to capture public preferences on the benefits of home-visiting programme for teenage mothers [version 1; peer review: awaiting peer review]

Background: Complex health and social care interventions affect a multitude of outcomes. One such intervention is the Family Nurse Partnership (FNP) programme, which was introduced to support young, first-time mothers. Our study quantified the relative values that the general public place on the outcomes of FNP, as identified and measured in the relevant randomized trial, the Building Blocks (BBs) trial. Methods: A discrete choice experiment (DCE) was employed. Respondents chose between two scenarios describing hypothetical sets of trial outcomes. BBs compared FNP care for teenagers expecting their first child with standard NHS care. Fourteen attributes covered three areas: pregnancy and birth, child development, and maternal life course. Owing to the large number of attributes, a “blocked attributes” approach was adopted: the attributes were split across four designs, each of which contained two common attributes. Data were analysed separately for each design as well as pooled across the four designs. A random effects probit model was employed for the analysis. Results: Over 1000 participants completed the four designs. The analyses of the separate designs and of the pooled data yielded broadly similar results. Respondents placed the highest value on outcomes related to child development and children's needs, followed by outcomes related to the maternal life course. Preferences varied by the age of the respondents but not by their guardianship/parentship status. Conclusions: Individual preferences were consistent with a priori expectations and were intuitive. The DCE results can be used to incorporate the general public's preferences into the decision-making process for which public health and social care policies should be
F1000Research 2020, 9:677. Last updated: 18 AUG 2020


Introduction
Becoming a mother at a young age is associated with disadvantages for both the mother and her baby [1][2][3] . The spectrum of possible disadvantage ranges from poorer birth and developmental outcomes for children [4][5][6] to reduced life chances in terms of education and employment opportunities for younger mothers 7,8 .
The Family Nurse Partnership (FNP) is an intervention designed to improve outcomes for young mothers and their children by supporting families on issues related to health during pregnancy and parenting skills, and on life skills including how to become self-sufficient. The programme has been supported by the Department of Health in many sites in England since 2006 9 and was evaluated via a randomized controlled trial 10,11 . The Building Blocks (BBs) trial evaluated four primary outcomes: birth weight, tobacco use in pregnancy, emergency attendances (A&E) and/or admissions of children within two years of their birth, and the occurrence of a subsequent pregnancy within two years of the first birth. There was also a range of secondary outcomes, both health and non-health related, which are fully described in the trial protocol 10 and report 11 . The medium-term impact of the intervention is being evaluated through a routine data linkage study 12 . The level of success of this intervention, as measured by the outcomes of this trial and the economic evaluation, helps inform commissioners' decisions on the future of the programme. The nature of the outcomes (health and non-health related), as well as the subject of interest (mother- or child-related outcomes), adds new dimensions to the economic evaluation of such a trial. A cost-utility analysis will present, for example, the cost per Quality Adjusted Life Year (QALY) gained by the mother but will fail to include the other outcomes 13 . A cost-consequences analysis reflects the fact that a complex intervention is associated with an array of health and non-health benefits that cannot be measured in a common unit 14 . Given the disparate nature of the relevant outcomes, and the fact that outcomes moved in different directions, it was deemed necessary to establish some relative value (or trade-off) between these outcomes.
In addition, given that public involvement in health-care decision making is an objective of health care policy makers in the UK 15,16 , it is important to elicit the preferences of the general public over the different outcomes of the trial, i.e. how the general public weighs up these outcomes. The findings of the randomized trial, together with the valuation of the outcomes as measured in this exercise, provide decision makers with useful knowledge that may inform the judgment about whether the intervention should be adopted or extended on a larger scale.
The aim of this study was to quantify the relative values that the general public place on outcomes/domains examined in the BBs trial using a discrete choice experiment (DCE). The DCE approach is a well-established method to explore preferences in health services research 17,18 . The approach is based on asking respondents to state their preferred option between two or more hypothetical scenarios which describe a service or a set of outcomes. The hypothetical scenarios include the descriptors (or attributes) the levels of which can be altered in each choice task. With a sufficient number of scenarios and respondents, it is possible to model the impact of specific descriptors on the choice of the respondents. As the individual considers simultaneously a number of attributes, DCEs are thought to overcome some of the problems that other choice methods, such as ranking, rating or standard gamble, pose 19,20 .

Methods
The DCE was designed as a web-based survey, following the general guidelines and steps for conducting a DCE 17,18,21 .

Selection of outcomes and experimental design
The selection of the DCE attributes and levels drew on the outcomes that were measured in the BBs trial. The trial focused on three areas of research: pregnancy and birth, child health and development, and maternal life course and self-efficacy. There were four primary outcomes: tobacco use in late pregnancy (34-36 weeks' gestation), birth weight, emergency attendances and hospital admissions for the infant within 24 months of birth, and the proportion of women with a second pregnancy within 24 months post-partum. In addition, a large number of secondary outcomes (over 60) were measured across the three aforementioned domains. All four primary outcomes and ten secondary outcomes were included as DCE attributes. The secondary outcomes were selected via a voting process by the team that designed the study protocol of the randomised trial, based on what they perceived as the most important efficacy measures.
All the attributes, with the exception of mother's health, had two levels (Table 1) describing whether one outcome was positive or negative. This binary format was chosen with the aim to keep the task simple and manageable for the respondents. Mother's health was described as a four-level attribute, corresponding to the following 3-level EQ-5D health states: 11111, 11112, 11211 and 11212. The focus was only on two of the five domains of the EQ-5D questionnaire 22 : usual activities and anxiety & depression, as these were considered most likely to be affected in this group of an otherwise healthy population. Further, the attribute was transformed into a continuous variable 23 by applying the relevant utility scores i.e. 1, 0.848, 0.883 and 0.812 respectively, based on the UK tariff 24 .
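The utility scores quoted above can be reproduced with a short sketch. It assumes that the UK tariff decrements for the two dimensions that vary here are 0.081 (applied once for any departure from full health), 0.036 (some problems with usual activities) and 0.071 (moderate anxiety or depression). The function name and its restriction to these four states are illustrative choices, not part of the study.

```python
# Sketch: reproduce the utility scores for the four EQ-5D-3L states used in
# the mother's health attribute. The decrements below are ASSUMED to be the
# UK tariff values for level-2 problems on the two dimensions that vary;
# the other dimensions and level-3 problems are deliberately not covered.

ANY_PROBLEM_CONSTANT = 0.081   # applied once if any dimension is above level 1
USUAL_ACTIVITIES_L2 = 0.036    # some problems with usual activities
ANXIETY_DEPRESSION_L2 = 0.071  # moderately anxious or depressed


def eq5d_utility(state: str) -> float:
    """Utility for a 5-digit EQ-5D-3L state, limited to the states in the DCE."""
    if len(state) != 5 or any(ch not in "12" for ch in state):
        raise ValueError("expected five digits, each 1 or 2")
    mobility, self_care, usual, pain, anxiety = state
    if (mobility, self_care, pain) != ("1", "1", "1"):
        raise ValueError("only usual activities and anxiety/depression may vary")
    utility = 1.0
    if usual == "2" or anxiety == "2":
        utility -= ANY_PROBLEM_CONSTANT
    if usual == "2":
        utility -= USUAL_ACTIVITIES_L2
    if anxiety == "2":
        utility -= ANXIETY_DEPRESSION_L2
    return round(utility, 3)


utilities = {s: eq5d_utility(s) for s in ("11111", "11112", "11211", "11212")}
print(utilities)
```

Under these assumptions the four states map to 1, 0.848, 0.883 and 0.812, matching the values used to turn the attribute into a continuous variable.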
Due to the large number of attributes, the experimental design was based on the "blocked attribute" approach 25,26 . The method allows a large number of attributes to be allocated across more than one design with overlap: some attributes are treated as "common attributes" and appear in all the designs, while the non-common attributes appear separately in each design. The fundamental assumption is that respondents have the same preferences over the whole set of attributes whether the attributes are presented together or in blocks 25 . The analysis is based on the "pooled" data from all the designs. The chosen attributes were split across four designs with two common attributes based on the primary outcomes of the trial: one primary outcome related to the mother and one related to the child. The rest of the attributes were allocated to the four designs, attempting to achieve a balanced representation of mother and child related outcomes in each design.

Table 1 (excerpt): attribute levels and coding
Mother's health
• Mother has no problems with mobility, self-care, usual activities, has no pain or discomfort & is moderately anxious or depressed = 1
• Mother has no problems with mobility, self-care, has some problems with performing usual activities, has no pain or discomfort & is not anxious or depressed = 2
• Mother has no problems with mobility, self-care, has some problems with performing usual activities, has no pain or discomfort & is moderately anxious or depressed = 3
Confidence
• Mother was confident that she could achieve her goals & believed she could solve problems well = 0
• Mother was not confident that she could achieve her goals & believed she could solve problems well = 1
The designs were constructed in SAS software using the D-efficiency criterion 27 . D-efficiency reflects a trade-off between orthogonality and balance: a value of 0 means that one or more parameters cannot be estimated, whereas a value of 100 means that the design is perfectly balanced and orthogonal. Fifteen choice-sets were constructed for designs A-C and sixteen choice-sets for design D. These were fractional factorial designs, with only the main effects included. The D-efficiency was 98.96% for designs A-C and 83.91% for design D.
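To illustrate the two ends of the D-efficiency scale, the sketch below computes a D-efficiency for a linear main-effects design with -1/+1 coding, using 100 × |X'X|^(1/p) / N, where N is the number of rows and p the number of parameters. This is an assumed textbook form; the SAS routine used by the authors may compute efficiency for the choice model in a different way, and the toy designs are not the study designs.

```python
import itertools

import numpy as np


def d_efficiency(design: np.ndarray) -> float:
    """D-efficiency (0-100) of a linear main-effects design coded -1/+1.

    `design` has one row per profile and one column per attribute;
    an intercept column is added before the information matrix is formed.
    """
    n_rows = design.shape[0]
    x = np.column_stack([np.ones(n_rows), design])
    n_params = x.shape[1]
    # Rank-deficient model matrix: at least one parameter is not estimable.
    if np.linalg.matrix_rank(x) < n_params:
        return 0.0
    sign, log_det = np.linalg.slogdet(x.T @ x)
    if sign <= 0:
        return 0.0
    return float(100.0 * np.exp(log_det / n_params) / n_rows)


# Toy designs (illustrative only).
full = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)
eff_full = d_efficiency(full)          # balanced and orthogonal full factorial
eff_subset = d_efficiency(full[:6])    # unbalanced subset of the same factorial
confounded = np.column_stack([full[:, 0], full[:, 0]])  # two identical columns
eff_confounded = d_efficiency(confounded)

print(eff_full, eff_subset, eff_confounded)
```

The full factorial reaches the upper bound, the unbalanced subset falls below it, and the confounded design scores zero because its two attribute effects cannot be separated.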
Once the choice-sets were constructed, qualitative interviews with 15 members of the general public were conducted at the Centre for Trials Research, University of Cardiff, with the aim of establishing whether:
• The respondents understood the choice context and the task in general
• The choice of attributes and levels was appropriate
• The complexity of the task affected the completion of the choice task
• The length of the survey was likely to affect the response rates
Two rounds of cognitive interviews were conducted. During the first round, a questionnaire was devised using a sample of scenarios from each questionnaire and an extra scenario that simplified the attribute about mother's health. These questionnaires were tested on six participants. During the second round, four complete questionnaires were used (designs A to D 28 ) and were tested on eight participants.
The feedback from this piece of work resulted in a few changes to the wording of the original attributes, but no changes in the experimental design.

Participant recruitment and choice task
The study was a web-based, self-completion survey, chosen to generate a good response rate whilst remaining relatively inexpensive. A market research company (Dynata, formerly Research Now) undertook the recruitment of respondents, data collection and data handling. The survey was converted into an online questionnaire which respondents could access via their computers. Members of the UK general public, across various geographical locations, who were ≥18 years old were invited via email and text messages to respond to the survey and were reimbursed GBP 2.5 for their participation. Probabilistic sampling was used with the aim of achieving a population-representative sample from the online panel of the market research company. Data were collected on a number of sociodemographic characteristics to assess whether these characteristics had an impact on the choice of the preferred outcomes.
Four groups of respondents were randomly allocated to complete one of the four designs (A to D) and a fifth group completed all four designs. A minimum of 70 respondents per design was estimated based on the following formula 29 :

$$N \geq \frac{500\,c}{t \times a}$$

where c is the number of analysis cells (for a design which includes only main effects, the largest number of levels for any attribute), t is the number of choice tasks and a is the number of alternatives.
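A commonly used rule of thumb with exactly these three inputs is N ≥ 500c / (t × a). Assuming that this is the formula meant here, and assuming c = 4 (the four-level mother's health attribute) and a = 2 (two scenarios per choice set), the calculation can be sketched as follows. The function name is illustrative.

```python
import math


def min_sample_size(c: int, t: int, a: int) -> int:
    """Smallest whole N satisfying N >= 500 * c / (t * a) (assumed rule of thumb)."""
    if min(c, t, a) <= 0:
        raise ValueError("c, t and a must be positive")
    return math.ceil(500 * c / (t * a))


# Assumed inputs: 4 levels at most, 2 alternatives, 15 or 16 choice tasks.
n_designs_a_to_c = min_sample_size(c=4, t=15, a=2)
n_design_d = min_sample_size(c=4, t=16, a=2)
print(n_designs_a_to_c, n_design_d)
```

Under these assumptions both thresholds fall just below the minimum of 70 respondents per design stated in the text.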
Following an introductory, informative page, respondents were asked to choose the most preferred scenario out of two in each choice set describing possible outcomes of the trial.

Data analysis
Demographic characteristics of the sample were compared with those of the general UK population (% representation for each variable) to gauge whether the sample was a good representation of the general population. Data from the DCE were analysed using a random effects probit model in Stata (version 12). This is one of the conventional approaches for analysing DCE data 17 . The models were estimated on the data from each separate design as well as on the pooled data, with and without interactions with demographic variables.
The model used for the analysis of the data from each separate DCE design was:

$$\Delta Y = \alpha_0 + \alpha_1\,\text{d\_Attribute\_1} + \alpha_2\,\text{d\_Attribute\_2} + \alpha_3\,\text{d\_Attribute\_3} + \alpha_4\,\text{d\_Attribute\_4} + \alpha_5\,\text{d\_Attribute\_5} + v + \varepsilon$$

where ΔY is the change in utility from choosing scenario A over B (left minus right scenario), d_Attribute_1 to d_Attribute_5 are the differences in the levels of the five attributes between options A and B in each choice set, $\alpha_0$ and $\alpha_1$ to $\alpha_5$ are the coefficients to be estimated, v are the random effects and ε are the independent and identically distributed error terms. The variable indicating the "Choice" of scenario was coded as 1 if scenario A (left) was chosen and 0 if scenario B (right) was chosen. Because attribute differences were computed as left minus right, a binary attribute took a difference of 1 if only the left scenario included it as a "negative outcome" and a difference of -1 if only the right scenario did (and 0 when both scenarios showed the same level).
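The difference coding and the resulting choice probability can be sketched as below. The coefficient values are hypothetical and chosen only for illustration, the random effect is fixed at zero, and the helper names are not taken from the study; fitting the actual random effects probit model requires a panel estimator and is not attempted here.

```python
import math


def attribute_differences(left, right):
    """Left-minus-right difference for each attribute in one choice set."""
    return [l - r for l, r in zip(left, right)]


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_choose_left(diffs, alphas, intercept=0.0, random_effect=0.0):
    """Probit probability that the left scenario (A) is chosen."""
    index = intercept + random_effect + sum(a * d for a, d in zip(alphas, diffs))
    return normal_cdf(index)


# Binary attributes: 0 = positive outcome, 1 = negative outcome.
left = [1, 0, 0, 1, 0]
right = [0, 0, 1, 1, 0]
diffs = attribute_differences(left, right)   # [1, 0, -1, 0, 0]

# HYPOTHETICAL coefficients; negative values mean that a negative outcome
# lowers utility.
alphas = [-0.5, -0.3, -0.4, -0.2, -0.6]
p_left = prob_choose_left(diffs, alphas)
print(diffs, round(p_left, 4))
```

With these made-up coefficients the left scenario is slightly less likely to be chosen, because the negative outcome it carries has a larger weight than the one carried by the right scenario.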
The data from the four designs were then pooled so that all fourteen attributes were included in a single model. For the pooled data model, attributes that did not appear in a given design were assumed to show "no difference" in levels between the two scenarios, so the corresponding variables were set to zero.
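The zero-filling step used for pooling can be sketched as follows. The attribute names (`attr_1` to `attr_14`), the allocation of attributes to designs and the example rows are hypothetical placeholders; only the rule that omitted attributes receive a difference of zero comes from the text.

```python
# HYPOTHETICAL names for the fourteen attributes; attr_1 and attr_2 stand in
# for the two common attributes that appear in every design.
ALL_ATTRIBUTES = [f"attr_{i}" for i in range(1, 15)]


def pool_row(design_row: dict, design_label: str) -> dict:
    """Expand one design-specific row of attribute differences to all 14 attributes.

    Attributes that were not part of the design get a difference of 0,
    i.e. "no difference" between the left and right scenario.
    """
    unknown = set(design_row) - set(ALL_ATTRIBUTES)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    pooled = {name: design_row.get(name, 0) for name in ALL_ATTRIBUTES}
    pooled["design"] = design_label
    return pooled


# One illustrative choice set from each of two designs (left-minus-right differences).
row_a = {"attr_1": 1, "attr_2": -1, "attr_3": 0, "attr_4": 1, "attr_5": -1}
row_b = {"attr_1": -1, "attr_2": 0, "attr_6": 1, "attr_7": 1, "attr_8": -1}

pooled_rows = [pool_row(row_a, "A"), pool_row(row_b, "B")]
for row in pooled_rows:
    print(row)
```

Keeping the design label on each pooled row makes it straightforward to add design dummy variables in a later model.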
Several separate regression models were estimated. The first model included only the attributes (i.e. no dummy variables for the different designs or interaction terms with the socioeconomic variables). A separate model was fitted to the pooled data by adding dummy variables for each design. Another model was fitted by including a dummy for the respondents who completed all four designs. Additional models examined how preferences varied according to respondents' demographic characteristics (i.e. age, gender and parenthood/guardianship status). These models included the fourteen attributes, pooled data from four designs, and interaction term variables.

Ethics approval and consent to participate
Ethics approval was obtained from the Research Governance Committee at the Department of Health Sciences, University of York.
Participants' consent was inferred once the participant had read the information sheet 30 and proceeded to complete the questionnaire.

Results
Sample demographic characteristics
A total of 207 respondents completed all four DCE designs (A+B+C+D), 200 respondents completed each of designs A, C and D, and 201 completed design B. The sample achieved a good representation of the general adult UK population for all four categories of age, home ownership and marital status but not for gender, as there was an over-representation of female participants (Table 3).

The models
The results from each design separately, as well as the results from the pooled data, with and without including design dummy variables, are presented in Table 4. The design-specific dummy variables are not significant, suggesting that there were no unobserved differences between the samples that completed each design or between the designs. The differences in respondents' characteristics might have been accounted for by the random allocation of the respondents across designs.
The results of the model on the pooled data, without including the design dummies are discussed here. Given the coding of the binary attributes (Table 1), the negative signs of the coefficients would suggest that respondents' utility decreases as the level of the attribute increases, or becomes a negative outcome (0= positive outcome, 1=negative outcome). Positive signs would indicate the contrary i.e. respondents derive higher utility as the level of the attribute decreases (i.e. becomes a "positive outcome"). All the regression coefficients have the expected signs and are statistically significant, indicating that the respondents prefer a programme that would improve the outcomes described in this exercise. The results are in line with the a priori expectation that positive outcomes related to vulnerable, pregnant women and their children are preferable to negative outcomes.
The highest-ranked attributes in the separate designs are also the five highest-ranked attributes in the pooled data model, showing that the results are consistent across models. The relative ranking of the two common attributes (second pregnancy and A&E attendances) is the same in all the models: A&E attendance has a larger coefficient than the occurrence of a second pregnancy, showing that it is always considered the more important outcome. In the pooled model, the largest coefficients mainly relate to outcomes that affect children directly: whether the mother fulfils the child's needs, whether there is a bond between mother and child, A&E attendances for the child and the child's language development. Therefore, one could conclude that the outcomes related to the child are valued more highly than the outcomes related to the mother.

Subgroup analyses
The models based on interaction effects with demographic characteristics (Table 5) demonstrate that preference for the occurrence of child's A&E attendances, education and employment of the mother and whether the child's needs are met, varied by gender. The preferences between genders change in terms of their magnitude rather than the nature of the outcome. Women have higher disutility than men if the above attributes take the "negative outcome" value, i.e. the child attends A&E, the mother does not return to education or employment, and the child's needs are not met.
Preferences varied by age, with significant differences observed for A&E attendances, smoking during pregnancy, employment of the mother, vaccination of the child and mother's confidence. For the child's A&E attendances, vaccination and mother's confidence, the working-age respondents (age <= 65) derive higher utility from the negative outcomes taking place (i.e. the child attends A&E during the first two years of life, the child does not have the recommended vaccinations, and the mother does not feel confident that she can achieve her goals) compared to the older respondents (age > 65).
Preferences did not vary greatly by guardianship status. Differences were observed for birth weight and language development of the child. Parents/guardians derived higher utility if the birth weight of the baby was below a healthy range and if the language development of the child was not appropriate for its age.

Discussion
This DCE has established the relative values that the general public places on the outcomes measured in the BBs trial. Over 1000 responses were collected from a sample that was broadly representative of the UK general public on key variables. The large sample is one of the strengths of this study as it enhances the statistical efficiency. The guidelines on conducting DCEs suggest that sample sizes in the range of 1000 to 2000 respondents will produce small confidence intervals, even if the experimental design is not particularly efficient 33 . This is particularly important in our study where the data were pooled from four designs and the statistical efficiency might have been compromised. Respondents were randomly allocated to complete one or all four designs to minimize the selection bias.
The study shows that the general public value more highly outcomes that affect children directly: whether the mother fulfils the child's needs and the bond she develops with the child, the child's A&E visits, and the child's language development. The mother's smoking behaviour during pregnancy and the relationship she has with her partner are also ranked high in the preferences of the general public. Outcomes directly related to the mother, such as employment, education or her confidence in achieving goals or solving problems, are ranked much lower. In general, the results are intuitive, even more so when gender is taken into account. Women value the outcomes related to the mother more highly than men do. Although the direction of the preferences for both age and guardianship status appears counterintuitive, there are possible explanations for this. For example, parents may be more accepting of variability and less rigid about what is considered "normal".

The data from the four designs were pooled by the use of two common attributes, based on the assumption that the respondents' preferences for the 14 attributes would be the same regardless of whether they are presented all together or in separate designs 25 . If this assumption held exactly, the coefficients for the common attributes would be the same in all designs. Quantitatively, this was not the case, as the beta coefficients for the common attributes differed between designs. Still, the ranking of the common attributes was the same across the four designs. This was confirmed in the pooled data: respondents value the attribute "A&E attendances for children" as being more important than the occurrence of a second pregnancy. Also, the top five ranked attributes in the pooled model are the top-ranked attributes in the separate designs. This provides reassurance that pooling the data would not produce misleading results.
This DCE included a broad range of outcomes, related to both the mother and the child. The advantage of including a large number of attributes lies in the fact that, by mirroring the outcomes measured in the BBs trial, one broadens the decision-making context, so that outcomes for different subjects (mother and child, in this case), and both health and non-health outcomes, can be incorporated in the trading-off process. The probability of uptake of the FNP can be estimated by combining the information on the effectiveness observed in the trial (i.e. which of the outcomes are improved by this intervention) with the preference weights for these outcomes.
However, there are a number of caveats. The choice task given to the respondents was somewhat simplified to make the task more approachable to lay persons. The attributes were presented in a binary format and no attempt was made to assign levels other than describing the attribute as a 'positive' or 'negative' outcome. This approach has led to the preference values being calculated for fairly broad attributes which, it could be argued, do not reflect the nuanced outcome differences seen in the trial. An ideal scenario would have been to assign more levels, reflecting the outcomes observed in the trial. However, there were two reasons for adopting this approach. Firstly, the nature of the research question might have been too specific for a proportion of the respondents. For example, people without children might have found it more difficult to value the importance of certain child-related outcomes because the task is more abstract for them. Secondly, the large number of attributes called for a simple approach to keep the task manageable and avoid cognitive burn-out of respondents.
As an additional limitation, while respondents were able to trade off attributes, it is not clear whether there are interactions between attributes that have been omitted from the analysis. Omission of important interactions may lead to biased parameter estimates, i.e. an under- or overestimation of the relative value of an attribute.
Importantly, in order to be consistent with decision-making bodies, such as NICE, the DCE respondents were members of the general public. However, there is also an argument for considering the preferences of individuals who have experience of the condition of interest. Brazier et al. (2005) 34 call for greater debate on the subject of patients' valuation of health states. DCEs based on, for example, recent teenage mothers may yield differing estimates of the relative importance of attributes.
Despite these limitations, this DCE can be used to place the results of the BBs trial in a decision-making context, while taking into account the views of the general public on the preferred benefits or outcomes of FNP. An illustrative example of such an approach included two extreme scenarios, in which the BBs trial results were compared with a best-case and a worst-case trial output. The BBs trial demonstrated that the FNP results in small improvements in three of the secondary outcomes that were included in the DCE. No improvements were observed in the primary outcomes. This, in turn, translates into an extremely small probability of the FNP being accepted as a public health policy programme by the general public compared to the best-case scenario, where all the outcomes included in this DCE are improved by the intervention. Naturally, the probabilities of acceptance would change depending on which outcomes are improved by the intervention but also on how highly these have been valued by the general public. Hence, it would be entirely plausible for a trial to demonstrate improvements in all the primary outcomes (so that the intervention is considered successful) and yet for the probability of acceptance to be low because these outcomes are not valued highly by the general public. Nevertheless, the DCE provides a decision-making tool when the contribution of the general public or any other specific group of respondents is considered important.
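The scenario comparison described above can be sketched as a probit probability of preferring a given outcome profile to a reference profile. The four attribute names, the coefficient values and the "trial-like" profile are hypothetical placeholders rather than estimates or results from the study, and the calculation is only one assumed way of turning preference weights into an acceptance probability.

```python
from statistics import NormalDist


def acceptance_probability(profile: dict, reference: dict, coefficients: dict) -> float:
    """Probit probability that `profile` is preferred to `reference`.

    Attributes are coded 0 = positive outcome, 1 = negative outcome, so a
    negative coefficient means that a negative outcome lowers utility.
    """
    index = sum(coefficients[k] * (profile[k] - reference[k]) for k in coefficients)
    return NormalDist().cdf(index)


# HYPOTHETICAL preference weights for four illustrative attributes.
coefficients = {
    "child_needs": -0.6,
    "bonding": -0.5,
    "a_and_e": -0.4,
    "second_pregnancy": -0.2,
}

best_case = {k: 0 for k in coefficients}    # every outcome improved
worst_case = {k: 1 for k in coefficients}   # no outcome improved
# HYPOTHETICAL profile in which only one outcome is improved.
trial_like = {"child_needs": 0, "bonding": 1, "a_and_e": 1, "second_pregnancy": 1}

p_trial = acceptance_probability(trial_like, best_case, coefficients)
p_worst = acceptance_probability(worst_case, best_case, coefficients)
p_same = acceptance_probability(best_case, best_case, coefficients)
print(round(p_trial, 3), round(p_worst, 3), p_same)
```

In this sketch a profile identical to the reference is chosen half of the time, and the probability falls as more outcomes, or more highly valued outcomes, remain unimproved relative to the best case.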

Conclusion
Where complex public health interventions are expected to affect a range of potentially beneficial outcomes, understanding how to use evidence to inform policy becomes even more important. Home visiting programmes, such as the Family Nurse Partnership (FNP), have been evaluated against a wide range of outcomes (clinical and non-clinical) both in the short term, such as in the Building Blocks trial, and in the much longer term (such as in the US trials of NFP [35][36][37] ). There has also been some academic and policy debate [38][39][40] following publication of the main trial results of FNP in England 11 about the relative importance of the primary and secondary outcomes selected when commissioning the trial, which itself was only ever able to report on short-term benefit. For example, the two maternal primary trial outcomes were smoking in pregnancy and a second pregnancy within two years following the first child's birth. The former, but not the latter, was rated as of high value by respondents in the DCE, and both are subject to further development work by the FNP programme itself following publication of the trial 41 . A greater understanding of the importance of a short inter-pregnancy interval in the relevant English context, and of how policy should respond, remains an important task.