Developing a Minimum Data Set for a congenital abnormalities surveillance programme in Rwanda – a modified e-Delphi consensus study [version 2; peer review: 1 approved with reservations, 1 not approved] Previously titled: Developing a core outcome set for a congenital abnormalities surveillance programme in Rwanda – a Delphi consensus study

Background: In 2015 it was reported that approximately 300,000 newborns die within four weeks of birth every year, worldwide, due to congenital anomalies. This represents approximately 11% of neonatal deaths. This has led scientists, clinicians and public health authorities to establish congenital abnormality registries (CARs). There is currently no CAR in Rwanda. In establishing such a registry, it was determined that the first step was to identify the Minimum Data Set (MDS) of items/variables and outcomes for the registry to ensure that the final results are meaningful and employable. This study aimed to use Delphi consensus methods to identify a methodologically robust MDS for a congenital abnormality surveillance programme in Rwanda.
Methods: A three-round, modified Delphi study was undertaken between April and June 2017. Round 1 was a literature and internet search followed by an open and closed question round with experts in Rounds 2 and 3, respectively.
Results: An initial draft MDS of 134 items was created from a review of 15 African studies and 14 international repository tools including the European Surveillance of Congenital Anomalies and the World Health Organization surveillance guidance. In total, 36 and 34 eligible participants were included in Rounds 2 and 3, respectively. A total of


Introduction
Congenital abnormalities are defined as malformations of organs or body parts during development in utero; they are present at birth and are therefore of prenatal origin 1,2. The prognosis of neonates with congenital abnormalities is often poor 3. Annually, approximately 300,000 newborns die within four weeks of birth, worldwide, due to congenital anomalies, representing approximately 11% of neonatal deaths 4. The most common congenital abnormalities are congenital heart abnormalities, neural tube defects and Down syndrome 4. Causes of congenital abnormalities are genetic, environmental or idiopathic. More than 90% of these newborns are born in low- and middle-income countries (LMICs) 5. In resource-limited settings, despite the balance of burden of disease, there is limited epidemiological data about the rate, risk factors and types of congenital abnormalities. In Rwanda, data monitoring is already being undertaken via the Integrated Health Management Information System (HMIS) 6. This is a nationwide data-surveillance programme, with health facilities reporting the total number of births with congenital anomalies, but no detail of individual cases. Though a positive start, it does not capture rich enough data for meaningful objectives to be met.
When epidemiological data have been presented, the outcomes described are commonly not consistent. The importance of the prevalence of congenital abnormalities has led scientists and public health authorities to establish congenital abnormality registries (CARs). These surveillance systems, based on high-quality epidemiological data, are required to identify preventable causes and for policymakers to plan care provisions 7. A large number of CARs have already been established, predominantly in high-income countries. The European Surveillance of Congenital Anomalies (EUROCAT) initiative is a good example of this. EUROCAT aims to carry out epidemiologic surveillance of congenital anomalies in Europe 8. There are several objectives to the EUROCAT initiative, including: i) provision of essential epidemiologic information on congenital anomalies in Europe, ii) facilitating the early warning of teratogenic exposures, iii) evaluating the effectiveness of primary prevention, to assess the impact of developments in prenatal screening, and iv) acting as an information and resource centre regarding clusters or exposures or risk factors for concern 8. Finally, the EUROCAT initiative aims to act as a catalyst for the setting up of registries throughout Europe collecting comparable, standardized data 8. These are admirable objectives for the European continent, and resource-limited settings, with significant burdens of congenital abnormalities, should aim to establish registries and surveillance programmes with equally ambitious goals. The World Health Organization (WHO) has also given high-quality guidance on congenital abnormality surveillance programmes 2. This guidance includes similar objectives to EUROCAT and also gives further objectives such as detecting clusters (outbreaks) of congenital anomalies 2.
When undertaking research or creating registries, the outcome measures should be valid, reliable and feasible 9 . That is, outcomes should adequately meet the criteria of truth, be sensitive to change and be easily applied and interpreted. They should also be relevant to the setting and the stakeholders who will engage with the data. More emphasis is being placed on ensuring the high quality of items measured in research and surveillance programmes. If research has not been conducted to identify the most appropriate items, several problems may impair the usefulness of the research findings in informing clinical practice. For example, researchers may choose items to suit their own needs, heterogeneous items can impair future synthesis of research findings, and without pre-defined items, it is difficult to know if publishing authors have neglected to include items found in their research 10,11 . It is for these described reasons that many researchers are starting their research and/or registry development with a significant piece of research work to identify the "core outcome set" (COS) in the relevant research field, which may also be known as a "Minimum Data Set" (MDS). During this process, a great deal of research energy and time is invested into identifying the items/variables and outcomes to ensure that the final results of the future research and data-collection are meaningful and employable. There are several considerations regarding the choice of method to use when developing a COS and/or MDS, which include factors such as the need for methodological rigor in the consensus process, ensuring a diverse range of stakeholder opinions, and finally financial constraints and carbon costs that might limit the practicality of face-to-face meetings 10,11 . The Delphi technique is a well-respected tool for establishing a COS and is a structured process utilizing a series of 'rounds' to gather information until group consensus is reached. 
Each Delphi round employs individuals across diverse geographical locations and diverse areas of expertise. The anonymous nature of the Delphi technique also avoids domination of the consensus process by one or a few experts 12.

Amendments from Version 1
We would like to thank the peer-reviewers for their valuable contributions to our study. The most significant change is the change in terminology from "Core Outcome Set" to "Minimum Data Set". The excluded participant has been removed from the demographic Table, and Table 1 and Table 2 have been aligned to ensure easier reading. An incorrect data point has been addressed in Table 1 and the body of the text modified accordingly. The reviewers have highlighted some important limitations that we had not considered and we have added these to the discussion section.
Any further responses from the reviewers can be found at the end of the article.
In Rwanda there is a long-term goal to develop a Rwandan surveillance program to provide monitoring and epidemiologic data that could be a first step in identifying risk factors and to improve the provision of care of the families of children with congenital abnormalities.
The WHO has given guidance that the variables to be included in a surveillance programme (registry) may vary, depending on the capacity and resources of the health-care system and surveillance programme 2. This Delphi study aimed to use consensus methods to identify a methodologically robust Minimum Data Set for a Rwandan congenital abnormalities surveillance program. This MDS was intended to include a range of items, namely: risk factors, clinical features, syndromes, and outcomes.

Study design
A three-round, modified Delphi-study was undertaken between April and June 2017. Reporting of the study is in accordance with the Sinha and Williamson (COMET, Core Outcome Measures In Effectiveness Trials) checklists for creating a COS using Delphi techniques 10,11 . No study protocol has previously been published for this study.

Round 1
Aim. Round 1 aimed to produce a draft MDS using published resources. This is a common practice in Delphi studies with the justification that the number of possible domains and items to include in the MDS of a Congenital Abnormality Surveillance program is substantial. Without providing an initial draft MDS to participants the level of recruitment and engagement would be low 11 .

Search strategy.
A PubMed literature and Google internet search was performed to identify CARs (globally) or epidemiological studies investigating the prevalence and description of congenital abnormalities (African continent). PubMed was searched using Medical Subject Headings (MeSH) keywords and synonyms for: "newborn" AND "congenital abnormalities" AND "developing countries" (see Extended data, Additional Supporting File 1 for a full search strategy 13). The search was limited to human studies in the English language with no date limits imposed, the final search being undertaken on 26th January 2017. Epidemiological studies providing a description of all congenital abnormalities of an entire population were included (see Supporting File 1). Articles looking at specific congenital abnormalities alone (e.g. cranial defects, congenital heart defect) were excluded. Google was searched using the above synonyms along with synonyms for "registries".
Obtaining items. When contact details were available in the article, authors of the articles/registries were contacted by email to gain their outcome sets, questionnaires and/or data-collection tools. When contact with the author was not possible the item set was extracted from the materials available (i.e. the journal article or website).
Coding. Individual items (e.g. gestation, cleft palate, Pierre Robin, etc.; see Additional file 2) were then coded for content and frequency. The item codes were categorized into domains (e.g. maternal details, clinical signs, etc.; see Table 1).

Round 1 consensus.
We aimed for a minimum of 15 item sets from either published papers or repositories. Consensus was predefined to include any domain or item found in two or more of the identified registries or journal articles. These domains and items were then added to the first draft of the MDS.
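The Round 1 inclusion rule can be illustrated with a short sketch. This is not the authors' procedure (coding was done in Microsoft Excel); the function name `draft_items` and the toy source and item names are hypothetical and serve only to show the "two or more sources" rule.

```python
from collections import Counter


def draft_items(sources, min_sources=2):
    """Return items that appear in at least `min_sources` distinct sources.

    `sources` maps a source name (article or registry) to its list of coded items.
    """
    counts = Counter()
    for items in sources.values():
        # set() ensures an item is counted once per source, even if listed twice
        counts.update(set(items))
    return sorted(item for item, n in counts.items() if n >= min_sources)


# Hypothetical toy data, not taken from the study
example_sources = {
    "article_A": ["gestation", "cleft palate", "spina bifida"],
    "article_B": ["spina bifida", "birth weight"],
    "registry_C": ["cleft palate", "spina bifida", "maternal age"],
}

print(draft_items(example_sources))  # ['cleft palate', 'spina bifida']
```

Items reported by a single source (here "gestation", "birth weight" and "maternal age") are dropped, which mirrors how 134 of the 283 coded items were carried into the first draft.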

Selection of study participants (Rounds 2 and 3)
Inclusion criteria. Participants were all medically qualified physicians and needed current or previous experience of working in a resource-limited setting, such as Rwanda. All participants needed to have experience of caring for children with congenital abnormalities. Physicians were chosen as Round 1 will have identified existing epidemiological tools, and in Rwanda physicians are likely to be the major stakeholders in collecting and utilizing data.
Sampling/enrolment. Recruitment was undertaken from the following sources: i) Rwandan paediatricians (n=53) via the Rwandan Pediatric Association (RPA) records, ii) Rwandan pediatric residents (n=49) via the University of Rwanda (UR) records, iii) International paediatricians previously working on the Human Resources for Health programme 14 (n=35) via the Ministry of Health (MoH) records, and iv) Authors of journal articles from Round 1 of the Delphi process (n=15) via the correspondence address. Invitations were sent via email with a link to the questionnaire, which is available as Extended data 13 .

Sample size (Rounds 2 and 3)
We aimed to gain responses from a minimum of 15-30 respondents in each round, which is considered the required number for gaining consensus in Delphi techniques 15 . The response rate was predicted to be 10-20%. Therefore, 152 participants were invited.
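As a rough check of the sample-size reasoning, the arithmetic can be sketched as follows. The invitation counts come from the recruitment sources listed above; the variable and function names are illustrative assumptions.

```python
# Invitations per recruitment source, as reported in the sampling section
invited_by_source = {
    "Rwandan paediatricians (RPA)": 53,
    "Rwandan paediatric residents (UR)": 49,
    "International paediatricians (MoH)": 35,
    "Round 1 authors": 15,
}

total_invited = sum(invited_by_source.values())


def expected_respondents(invited, response_rate):
    """Expected number of respondents for a given response rate (0-1)."""
    return invited * response_rate


low = expected_respondents(total_invited, 0.10)   # pessimistic prediction
high = expected_respondents(total_invited, 0.20)  # optimistic prediction

print(total_invited, round(low, 1), round(high, 1))
```

With 152 invitations, a predicted response rate of 10-20% corresponds to roughly 15-30 respondents, which matches the target range for each round.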

Procedures: Round 2
Feedback. Feedback from Round 1 was given to participants in the form of the first draft MDS, available as Extended data 13 .
Instructions to participants. Each domain was presented individually, with its respective items, followed by an open question: "On reviewing the above outcomes for the domain of <<domain title here>> are there any ADDITIONAL outcomes that you feel that SHOULD be included in the Rwandan Congenital Abnormalities Surveillance Program? (You may list these or write free text, as you wish. You do not need to write a justification of your additions)". Participants were presented with an open-ended text-box where they could "free text" any additional items to include.

Coding and consensus.
Items suggested by participants were coded in Microsoft Excel. Consensus was predefined as a minimum of two independent persons suggesting an item prior to it being included in the MDS for Round 3 11 . The initial draft item set from Round 1 and the new items meeting consensus were combined to create the second draft MDS.
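The Round 2 rule (an item is added only when at least two independent participants suggest it) could be expressed as below. This is a minimal sketch under stated assumptions: suggestions are taken to be already coded to a common label, `existing_items` is assumed to hold lower-case labels, and the participant identifiers and item names are hypothetical.

```python
from collections import defaultdict


def round2_new_items(suggestions, existing_items, min_participants=2):
    """Return new items proposed by at least `min_participants` distinct people.

    `suggestions` is an iterable of (participant_id, coded_item) pairs.
    `existing_items` is a set of lower-case labels already in the draft MDS.
    """
    proposers = defaultdict(set)
    for participant, item in suggestions:
        label = item.strip().lower()
        if label not in existing_items:
            # a set of participants means repeat suggestions count once
            proposers[label].add(participant)
    return sorted(
        label for label, who in proposers.items() if len(who) >= min_participants
    )


# Hypothetical toy data
example_suggestions = [
    ("P1", "Maternal diabetes"),
    ("P2", "maternal diabetes "),
    ("P1", "maternal diabetes"),   # same participant again: not independent
    ("P3", "Family history"),      # only one proposer
    ("P2", "Gestation"),           # already in the draft MDS
]

print(round2_new_items(example_suggestions, existing_items={"gestation"}))
```

Counting distinct participants rather than raw mentions is the design choice that corresponds to "two independent persons" in the consensus definition.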

Procedures: Round 3
Instructions to participants. In Round 3, closed questioning was employed. Each item of the second draft MDS was presented using a 1-9 point scale, as described by Guyatt and the GRADE development group 9,16 . Items were presented within their domain with the following instructions: "Where 1-3 are unimportant, 4-6 is ambivalent and 7-9 is important, how important is it that the following are included in the Rwandan Congenital Abnormalities Surveillance Program".

Feedback.
Feedback was given to participants by giving the frequency the item was described in Round 1 (as a percentage) or if it was a new addition from Round 2.

Consensus.
Consensus for inclusion in the final MDS was pre-defined as greater than 70% of participants scoring 7-9 (important) AND less than 15% of participants scoring 1-3 (not important) 11 .
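The Round 3 consensus definition translates directly into a two-part threshold test. The sketch below assumes each item's responses are available as a list of integer scores from 1 to 9; the function name and the example score distributions are hypothetical (the study's analysis was carried out in Microsoft Excel).

```python
def meets_consensus(scores, important_min=0.70, unimportant_max=0.15):
    """Apply the Round 3 rule to one item's scores (integers 1-9).

    Included if MORE than 70% of participants score 7-9
    AND FEWER than 15% of participants score 1-3.
    """
    n = len(scores)
    if n == 0:
        return False
    important = sum(1 for s in scores if 7 <= s <= 9) / n
    unimportant = sum(1 for s in scores if 1 <= s <= 3) / n
    return important > important_min and unimportant < unimportant_max


# Hypothetical score distributions for a panel of 34 participants
clear_support = [8] * 26 + [5] * 5 + [2] * 3   # 76% important, 9% unimportant
ambivalent = [7] * 20 + [5] * 14               # only 59% important
polarised = [9] * 28 + [1] * 6                 # 82% important but 18% unimportant

print(meets_consensus(clear_support))  # True
print(meets_consensus(ambivalent))     # False
print(meets_consensus(polarised))      # False
```

The third case shows why both conditions matter: a strongly supported item is still excluded when a sizeable minority rates it as unimportant.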

Data collection, management and analysis
In Round 1, items and domains were coded and input into Microsoft Excel. For Rounds 2 and 3, Google Forms was used to administer the questionnaire and collect data. A web link to the Google Forms questionnaire was sent individually to each participant via email. Questionnaires 13 for Rounds 2 and 3 were kept open for two weeks. An email reminder was sent once for each round. Participants were reminded of the importance of completing both Rounds 2 and 3 of the Delphi process and asked to complete both rounds. However, because of the fully anonymous nature of responses, non-responders from Round 2 were still invited to Round 3. New participants were not invited between Rounds 2 and 3. Data analysis was undertaken in Microsoft Excel and was descriptive based on the above defined consensus.

Ethical issues
The project was undertaken as the Master's dissertation for AM at the University of Rwanda. This research proposal was therefore approved by the Institutional Review Board (IRB) at the University of Rwanda (155/CMHS IRB/2017). The proposal was not published online or in a peer-reviewed journal. A statement of consent was included in the email communication with participants. Responding to the questionnaire was deemed as informed consent.

Results
We undertook a modified Delphi study. A study flow diagram is available in Figure 1.

Round 1 search
Our PubMed literature search revealed 545 results. Titles were reviewed along with the abstract, where necessary. We identified 15 relevant journal articles giving descriptions of congenital abnormalities on the African continent and meeting the inclusion criteria (see Additional Supporting File 1). This number of articles was appropriate to deliver consensus on key items for the initial draft MDS. Therefore, a further journal/literature search was not performed for studies outside of the African continent. The authors of the included papers were contacted and only one of the authors replied with details of their MDS. Therefore, items were extracted from the journal articles.
Regarding existing congenital abnormality surveillance programmes (registries), we were unable to identify any active congenital abnormality repositories/registries from the African continent. We therefore extended our search to other continents. This search revealed 14 repositories: seven from Europe (Belarus = 1, Finland = 1, France = 2, Georgia = 1, Malta = 1, United Kingdom = 1), four from North America (USA = 3, Canada = 1) and one from Israel. The 14 repositories also included the tools from EUROCAT 8 and the WHO 2, which are the best-described guidance on surveillance programmes.

Round 1: Creating the first draft MDS
All items from the 15 journal articles and 14 repositories were coded in Microsoft Excel. From these 29 sources a total of 283 items were identified, of which 134 (47%) met our pre-defined criteria for consensus to be included in the first draft of the MDS. These items were categorized into 10 domains (Table 1). The consistency of items across these sources was low, with only eight of the 283 items (3%) being present in ten or more of the 29 articles/repositories (Additional Supporting File 2). All eight of these items met the criteria for inclusion in the final MDS. Each of the 283 items was found in a mean of 2.7 (9.3%) of the 29 journals/repositories (standard deviation, SD=2.8). Each of the 134 items included in the first draft MDS was found in a mean of 4.6 (15.8%) of the 29 journals/repositories (SD=3.1). The most commonly described items were "spina bifida" and "cleft palate/lip", which were both present in 16 (55%) of the articles/repositories. Each journal/repository described a mean of 27 items (SD=14). Repositories had more items than journal articles, with means of 30 and 24 items, respectively.

Round 2 and 3 participants
A total of 37 and 34 participants responded in Rounds 2 and 3, respectively (Table 2), giving response rates of 24% and 22%. This exceeded our expected response rate of 10-20% and resulted in more than the 15-30 participants needed for validity of each round. In Round 2, one participant had never treated children with congenital abnormalities and did not meet the inclusion criteria, leaving 36 eligible participants. In total, 73% and 63% of participants treated children with congenital abnormalities either frequently or very frequently. Participants were experienced, with a mean of 12 years and 11 years of pediatric experience for Rounds 2 and 3, respectively. Participants included general pediatricians, neonatologists, geneticists, neurologists and pediatric residents.

Attrition
Questionnaires were fully anonymous. Participants did not know the identities of the other individuals in the group, nor did they know the specific answers that any other individual gave. However, year of birth and initials were given by subjects in order to assess attrition rate. Of the respondents from Round 2, 17 (46%) contributed to Round 3, giving an attrition rate of 20 subjects (54%).
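The reported response and attrition percentages can be re-derived from the counts given in the text. The sketch below uses only figures stated in this article (152 invitations, 37 and 34 respondents, 17 participants contributing to both rounds); the helper name `pct` is an illustrative assumption.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


invited = 152
round2_respondents = 37
round3_respondents = 34
retained = 17                              # Round 2 respondents who also completed Round 3
lost = round2_respondents - retained       # Round 2 respondents lost to attrition

print("Round 2 response rate:", pct(round2_respondents, invited), "%")
print("Round 3 response rate:", pct(round3_respondents, invited), "%")
print("Retention:", pct(retained, round2_respondents), "%")
print("Attrition:", lost, "participants,", pct(lost, round2_respondents), "%")
```

Because non-responders from Round 2 were still invited to Round 3, the Round 3 panel is not simply a subset of the Round 2 panel, which is why attrition is computed against Round 2 respondents rather than from the difference between the two totals.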

Round 2 MDS
Items were presented within their domain on a separate page of the electronic questionnaire. After coding, a total of 219 new items were suggested by participants (Table 3). In total, 62 (28%) of these items were either "general suggestions" or already found in this or another domain. There were therefore 157 genuinely new items. Of these 157 new items, 32 (20%) were independently suggested by two or more participants; they therefore met the pre-defined definition of consensus, were added to the MDS and were carried through into the final round of the Delphi process.

Round 3
The questionnaire 13 was divided into ten sections reflecting the ten domains of the second-draft MDS. Items were individually presented in their respective domain. There were 166 items presented for scoring between 1 (non-important) and 9 (important) by participants. In total, 103 items (61%) met the pre-defined consensus criteria to be included in the final MDS.
The final steps were to: i) ensure items were in appropriate domains, ii) attach International Statistical Classification of Diseases and Related Health Problems (ICD) coding and naming conventions 17 to the items, iii) place all clinical features (n=52) into anatomical systems, and iv) alphabetise/order items for ease of use (see Supporting File 3).

Discussion
This study aimed to create an MDS for use in a Rwandan congenital abnormalities surveillance programme. Using structured consensus methods and taking into account the perspectives of experienced pediatric clinicians, an MDS of 103 variables and items, within ten domains, has been created.

Draft-MDS developed in Round 1
Our PubMed and internet search identified 29 articles and repositories. There was a large number of excluded descriptive studies of specific abnormality types (e.g. neural tube defects only). The literature search has shown that research studies on congenital abnormalities have been done in Africa but well-structured mechanisms for their surveillance are largely absent (i.e. journal articles are available but no repositories/registries were found); this is concerning when one considers the finding that more than 90% of newborns with congenital abnormalities are born in LMICs 4,5. These findings support the need for upscaling of surveillance programs on the African continent, where congenital abnormalities are responsible for a significant burden of disease.

Stakeholders
With a lack of structured mechanisms or tools to detect and follow-up neonates with congenital abnormalities in Rwanda, it was judged an essential first step to gain consensus from physicians who will case-find regarding which items to be included in the surveillance tool. The Delphi technique was found to be both cost-effective and practical with a methodological rigor to reach a large number of diverse experts and with an advantage of avoiding the negative effects of dominant individuals. Pediatricians are well placed to advise on the development of a surveillance programme for congenital abnormalities since they care for affected neonates and children in their daily practice. In Rounds 2 and 3, we received a higher than anticipated response rate and therefore gained more participants than expected. This was a welcome finding and therefore our results offer consensus of a larger body of experienced clinicians. The participants were also from several different settings giving a broader range of experience. The high response rate is also a sign of how important and relevant participants found the subject and hopefully reflects the "buy-in" regarding a future surveillance program.

Variables
The consistency of items described in the journals and repositories in Round 1 was low. We identified 15 relevant journal articles giving descriptions of congenital abnormalities on the African continent and meeting the inclusion criteria for Round 1 of our study. These studies did not describe rigorous methods for determining the items that they included. Each of the 134 items included in the first-draft MDS was found, on average, in only 16% of articles/repositories. In Round 2, 32 items were added to the MDS and carried through into the final round of the Delphi process. This could be explained by a number of factors, for example that previous studies had not adequately identified items, or that our participants wanted a more comprehensive MDS.
The need for this Delphi study to develop a locally relevant MDS was therefore supported by three factors: i) the lack of consistency in the items found in Round 1, ii) the absence of any description of rigorous methods in the previous studies, and iii) the large number of items (n=32) added by our participants in Round 2. The finding that the 14 repositories held a mean of 30 items suggests that our MDS may hold too many items. This may be due to the pre-defined threshold for consensus in Round 3. However, it is interesting to note the high number of new items suggested by participants in Round 2, which could suggest that these additional items are in fact needed or desired by our local professionals.

Application of findings
The next step in developing the Rwandan CAR is to field test (pilot) the MDS in the clinical environment. This is to ensure that completing data-collection using the MDS is feasible for physicians within day-to-day clinical practice. The WHO guidance on surveillance programmes describes that a programme may be population based or hospital/facility based and can use active or passive case ascertainment (case-finding) 2 . We intend to use the MDS developed in this study to commence passive case ascertainment in teaching hospitals, in order to establish baseline data and feasibility of such a surveillance programme.

Strengths and weaknesses of the study
Strengths of the study include the fact that the Delphi technique is a commonly used consensus method with several advantages whilst minimizing some of the disadvantages associated with collective decision making, for example, domination by individual interests 9 . We felt it was a priority to include the variables from EUROCAT 8 and the WHO 2 guidance on surveillance programmes and have used Delphi methods to build on these data-sets. We used physicians from different settings and different levels of clinical experience to ensure there was no dominance of a particular domain.
Limitations include the fact that our participants were limited to clinicians who had experience of caring for children with congenital abnormalities. COS-STAD standards require patients (in this study, parents) to be involved in the determination of important outcomes. In our study, patients or their families, researchers and biostatisticians were not included. It was felt that parents/carers would not have a significant enough understanding of the risks, clinical findings and/or syndromes being assessed. In hindsight, involving collaborators from nursing and allied specialties could have been beneficial, as they are often primary caregivers in this setting and therefore will be stakeholders in future surveillance programmes. A further limitation was the attrition rate from Round 2 to Round 3. The use of a computer-administered questionnaire will have inadvertently excluded less computer-literate participants. However, it has been found that participants who are willing to participate in consensus panels are generally representative of their colleagues 9. Another limitation is the use of non-African repository data-sets in the first draft, which may have biased the initial consensus: the first draft of the MDS included items from settings dissimilar to our intended setting. The list of items presented in Rounds 2 and 3 was long, and the order of items was not randomized to mitigate questionnaire fatigue. The items were sub-categorised into domains in an attempt to minimize this. We have not examined the data to assess or measure questionnaire fatigue. MDS and COS projects often include an in-person meeting to finalise the MDS and/or COS. This was not considered in our study. We used remote digital methods to ensure a wide range of participants without excluding the voice and opinions of anyone who would be unable to travel. A face-to-face meeting was also beyond the means of the study, which was the Master's project of AM and had no funding available.

Conclusions
This is the first MDS for congenital abnormalities identified in the literature from the African continent. It has been developed for use in Rwanda but could be relevant for use in the region and other resource-limited settings. In using this MDS, researchers have the potential to reduce research waste and allow improved comparison. Our MDS has now been used in a feasibility study and the development of the registry continues. If our MDS were to be employed in a similar setting, it is considered best practice to minimize any modification of the items; however, each setting should ensure that the items are relevant and add any additional items that may be relevant to its own population and the objectives of its surveillance programme 18.

Open Peer Review © 2020 Al Wattar B.
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Bassel H. Al Wattar
Warwick Medical School, University of Warwick, Coventry, UK
Thank you for asking me to review this article describing efforts to develop a "core outcome set" for a congenital abnormalities registry in Rwanda.
At face value, I think the authors are inaccurate in calling this a core outcome set; it is rather an adaptation of published outcomes in other national and international registries to what is applicable in Rwanda. In other words, this is a data dictionary or minimally agreed dataset for a prospective national registry.
Such a step is certainly welcome if a national registry is being planned, to increase its value and impact on patient care. Still, key elements are missing from this exercise, especially the involvement of multiple stakeholders, including patient representatives, health system researchers and policymakers, all of whom were absent from this Delphi.
I have no major comment on the conduct of the Delphi survey, but ultimately the impact of this dataset is only relevant to Rwanda with no generalizability to the medical literature.

If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Are all the source data underlying the results available to ensure full reproducibility? Partly

Are the conclusions drawn adequately supported by the results? No
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Academic obstetrician and gynecologist with experience in randomised clinical trials and core outcome sets

I confirm that I have read this submission and believe that I have an appropriate level of
would be of interest to the reader. Is it clear that they did not use a consensus process, and if not, how did they determine what to measure?

Methods
In round one, a long list of potential items has been created. This is not usually referred to as a 'draft COS' however since it consists of a list of any outcome ever measured by one or more groups rather than being based on any consensus process.
The list of items is long. Was there any randomisation to address the potential for survey fatigue? Could the authors examine the data for this problem? If not, this should be discussed as a potential limitation.
Pre-defining various elements of the process can reduce the potential for bias. Although no study protocol has been published in a journal, might there be one available online? Is there a Research Ethics Committee (REC) application where the design is described or did the authors determine that this work did not require REC approval? If the latter, a statement to this effect, with explanation, should be included.
It does not appear that an individual participant had the opportunity to review scores from other participants, to reflect on their own view, and then to rescore. The study design appears to be a series of two surveys therefore rather than a Delphi survey. This should be clarified.

Results
The authors state there was an ineligible participant but continue to include their results. This should be clarified and explained, and the tables amended as appropriate.
The addition of 32 items is high. What might this imply? How many of these additional items were in the final 103?
The number of new outcomes suggested in round 2 in the 'syndrome/diagnosis' domain is 6 in Table 1 but 8 in Table 3. Please clarify this inconsistency.
There were 8 items common to the previous registries. Were these 8 in the final set? This would be an interesting discussion point.

Discussion
When referring to the number of 'outcomes' in previous registries, is it outcomes or rather items?
The COS-STAD standards require patients (here parents) to be involved in the determination of important outcomes. Some discussion as to why parents were not involved in this project would be of interest to the reader.
MDS and COS projects often include an in-person meeting to finalise the set. Some discussion as to whether this was considered and, if so, why it was not pursued would be of interest to the reader.
Why is 'the use of non-African repository data-sets in the first draft' necessarily a limitation?
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

This article presents the items comprising a minimum data set for a registry of congenital abnormalities in Rwanda, determined through a survey of health professionals. It addresses an important topic, since the set up and maintenance of a registry is no small feat, and increasing the likelihood that key, relevant items are measured is critical.
I believe the authors are proposing a 'minimum data set' (MDS) rather than a core outcome set (COS). A MDS typically includes a COS plus additional variables (for example baseline, risk factors and treatment variables). Use of the term 'items' rather than 'outcomes', and 'MDS' rather than 'COS', throughout would be preferable therefore.
Response: Thank you for this feedback. During the initial write-up we found it difficult to know which terminology would best describe the process. We have therefore changed the terminology in line with the reviewer's feedback.

Background:
How did the 15 African studies reviewed determine items for their registries? This information would be of interest to the reader. Is it clear that they did not use a consensus process, and if not, how did they determine what to measure? Response: There was little or no description of how they chose the items. There was also a lack of consistency in the items that they chose. Together these two factors supported our own feeling that there was a need to develop this minimum data set. We have added some text to the discussion regarding the lack of description of how the previous studies identified their items.

Methods:
In round one, a long list of potential items has been created. This is not usually referred to as a 'draft COS' however since it consists of a list of any outcome ever measured by one or more groups rather than being based on any consensus process. Response: We didn't include all outcomes/items. Rather, we only included those that met a pre-defined threshold of consensus for this round, which is described in the methodology as "Consensus was predefined to include any domain or item found in two or more of the identified registries or journal articles. These domains and items were then added to the first draft of the MDS." Therefore, for this point we haven't made any change to the manuscript.
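The Round 1 inclusion rule quoted in this response (an item enters the first draft if it appears in two or more sources) can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the authors' actual workflow, which they describe as Excel and manual analysis: the function name `draft_items`, the source names and the item names are all hypothetical. The percentage it returns mirrors the kind of Round 1 frequency that was fed back to participants in Round 3.

```python
from collections import Counter


def draft_items(sources, min_sources=2):
    """Return {item: % of sources listing it} for items found in at least
    `min_sources` sources. `sources` maps a source name to its item list."""
    # Count each item once per source, even if a source lists it twice.
    counts = Counter(item for items in sources.values() for item in set(items))
    n_sources = len(sources)
    return {
        item: round(100 * count / n_sources, 1)
        for item, count in counts.items()
        if count >= min_sources
    }


# Hypothetical example data; not taken from the study.
sources = {
    "registry_A": ["birth weight", "maternal age", "sex"],
    "registry_B": ["birth weight", "sex", "gestational age"],
    "study_C": ["birth weight", "maternal age"],
}
shortlist = draft_items(sources)
```

Here "gestational age" appears in only one hypothetical source, so it falls below the threshold and is left out of the draft list.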
The list of items is long. Was there any randomization to address the potential for survey fatigue? Could the authors examine the data for this problem? If not, this should be discussed as a potential limitation. Response: We did not use randomization, and we certainly agree that this is a concern. We have added the following text to the limitations section of the manuscript: "The number of items found in Round-2 and -3 was long. Randomisation was not undertaken to avoid questionnaire fatigue. The items were sub-categorized into domains in an attempt to minimize this. We have not looked at the data to assess or measure for this questionnaire fatigue."
Pre-defining various elements of the process can reduce the potential for bias. Although no study protocol has been published in a journal, might there be one available online? Is there a Research Ethics Committee (REC) application where the design is described or did the authors determine that this work did not require REC approval? If the latter, a statement to this effect, with explanation, should be included. Response: In the section "Ethical issues" we have added the text "The project was undertaken as the Masters dissertation for AM at the University of Rwanda". It already states that "This research proposal was therefore approved by the Institutional Review Board (IRB) at the University of Rwanda (155/CMHS IRB/2017)." We have therefore also added the text "The proposal was not published online or in a peer-reviewed journal."
It does not appear that an individual participant had the opportunity to review scores from other participants, to reflect on their own view, and then to rescore. The study design appears to be a series of two surveys therefore rather than a Delphi survey. This should be clarified. Response: In Round 3, "Feedback was given to participants by giving the frequency the item was described in Round 1 (as a percentage) or if it was a new addition from Round 2." We did not, however, give participants the opportunity to reflect on their own score and rescore.
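The randomisation raised in the survey-fatigue comment above was not used in this study. For readers unfamiliar with the idea, one possible way of doing it is sketched here: the order of items is shuffled within each domain separately for every participant, so that no single item is always rated last. The function name `randomised_order`, the seed value and the domain and item names are hypothetical and chosen only for illustration.

```python
import random


def randomised_order(domains, participant_id, seed=2017):
    """Return [(domain, shuffled_items), ...] for one participant.

    Domain order is kept fixed; items are shuffled within each domain.
    Seeding with the participant ID makes each order reproducible.
    """
    rng = random.Random(f"{seed}-{participant_id}")
    ordered = []
    for domain, items in domains.items():
        shuffled = list(items)  # copy so the master list is not modified
        rng.shuffle(shuffled)
        ordered.append((domain, shuffled))
    return ordered


# Hypothetical domains and items; not taken from the study.
domains = {
    "maternal history": ["maternal age", "parity", "medication use"],
    "neonatal details": ["birth weight", "sex", "gestational age"],
}
order_p1 = randomised_order(domains, participant_id="P01")
order_p2 = randomised_order(domains, participant_id="P02")
```

Keeping the domains in a fixed order preserves the sub-categorisation the authors relied on, while comparing completion or scoring patterns against an item's position would give one way to examine the data for fatigue.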

Results:
The authors state there was an ineligible participant but continue to include their results. This should be clarified and explained, and the tables amended as appropriate. Response: The excluded participant was included only in the demographics table, not in the remaining analysis. We wanted to give an understanding of everyone who responded to the invitation to take part (response rate), but agree that this is confusing. We have removed the excluded participant from the demographics table; this has no impact on the rest of the data presented.
The addition of 32 items is high. What might this imply? How many of these additional items were in the final 103? Response: We feel that this implies that the initial studies/item sets from round-1 were not consistent and missed important items. We feel this supports the work that we have done to robustly identify our dataset.
The number of new outcomes suggested in round 2 in the 'syndrome/diagnosis' domain is 6 in Table 1 but 8 in Table 3. Please clarify this inconsistency. Response: Yes, this is an error. Thank you for spotting it. This has been one of the challenges of having no funding and therefore no Delphi-specific software, relying instead on Excel and manual data analysis. We have gone back to our data. There were 8 items in round 1 and 8 in round 2. We have therefore corrected the table, Figure 1 and the body of the text accordingly. In addition, we realised that the domains in Table 3 were in a different order to those in Table 1; we have therefore amended this to make it easier for the reader (see below).
There were 8 items common to the previous registries. Were these 8 in the final set? This would be an interesting discussion point. Response: We are assuming that this is in reference to the statement "with only 8 of the 283 items (3%) being present in ten or more of the 29 articles/repositories (Additional Supporting file 2)." Yes, all of these items were included in the final data-set, and we have added this to the discussion.

Discussion:
When referring to the number of 'outcomes' in previous registries, is it outcomes or rather items? Response: Yes, it is items. We have amended this throughout the text.
The COS-STAD standards require patients (here parents) to be involved in the determination of important outcomes. Some discussion as to why parents were not involved in this project would be of interest to the reader. Response: We had previously included a comment about this in the limitations, and we have now also added a comment about the COS-STAD standards.
MDS and COS projects often include an in-person meeting to finalise the set. Some discussion as to whether this was considered and, if so, why it was not pursued would be of interest to the reader. Response: We have added this into the limitations, along with the following statement: "This was not considered in our study. We used remote digital methods to ensure a wide range of participants without excluding the voice and opinions of anyone who would be unable to travel. A face-to-face meeting was also beyond the means of the study which was the Masters project of AM and had no funding available". We have also changed the title to "e-Delphi" to make this clearer from the start.
Why is 'the use of non-African repository data-sets in the first draft' necessarily a limitation? Response: Maybe because the needs of a data-set in a developed nation may be different to those in this setting. Therefore, including them may have biased the early consensus.

Conclusion:
The authors conclude that a new set of registry items should be developed for each setting. I believe this approach could contribute to research waste. I would recommend that each group wishing to implement a standardised set of registry items should consider existing sets first, to assess their generalisability to the setting at hand, and critically appraise the methods used. If a new set is still needed for the new setting, then a consensus process should be followed. Response: Sorry, the word "develop" was poorly chosen. We have reworded this section to give a better explanation of the point we were trying to get across.