Risk factors for major external structural birth defects among children in Kiambu County, Kenya: a case-control study [version 1; peer review: 1 approved, 2 approved with reservations]

Background: Although major external structural birth defects continue to occur globally, the greatest burden is shouldered by resource-constrained countries, most of which have no surveillance systems. To the best of our knowledge, few studies have been published on the risk factors for these defects in developing countries. The objective of this study was to identify the risk factors for major external structural birth defects among children in Kiambu County, Kenya. Methods: A hospital-based case-control study was used to identify the risk factors for major external structural birth defects in Kiambu County. A structured questionnaire was used to gather information retrospectively on exposure to environmental teratogens, multifactorial inheritance, and sociodemographic-environmental factors during the study participants' last pregnancies. Descriptive analyses (means, standard deviations, medians, and ranges) were used to summarize continuous variables, whereas categorical variables were summarized as proportions and percentages in frequency tables. Afterward, logistic regression analyses were conducted to estimate the effects of the predictors on major external structural birth defects in the county. Results: In the multivariable analyses, maternal age ≤34 years (aOR: 0.41; 95% CI: 0.18-0.91; P=0.03) and preceding siblings with a history of birth defects (aOR: 5.21; 95% CI: 1.35-20.12; P=0.02) were identified as the significant predictors of major external structural birth defects. Conclusions: Maternal age ≥35 years and siblings with a history of birth defects were identified as the risk factors for major external structural birth defects in Kiambu County, Kenya. This pointed to a need to create awareness among couples against delaying childbearing beyond 35 years of age and the need for clinical genetic services for women of reproductive age with a history of births affected by birth defects.


Introduction
Worldwide, an estimated 7.9 million children are born every year with a birth defect, of whom approximately 3.3 million die before age five and around 3.2 million could be physically disabled for life 1,2. More than 94% of such defects occur in developing countries, where about 95% of these children do not survive beyond childhood 1. Birth defects are defined as abnormalities of body structures or functions that develop during the organogenesis period (first trimester of gestation) and are detectable during pregnancy, at birth, or soon after 2,3. These defects may be classified as major when associated with significant adverse health effects requiring medical/surgical care; otherwise, they are described as minor 1,2. Alternatively, they can be classified as external when visible at birth or soon after, or internal when advanced medical imaging techniques are required for their detection 4-6. Consequently, the phrase 'major external structural birth defects' (MESBDs) denotes congenital physical abnormalities that are clinically obvious at birth or soon after and that call for medical and/or surgical interventions 1,2. The causes of these defects can be classified into three categories: (i) identifiable environmental factors (teratogens/micronutrient deficiencies); (ii) identifiable genetic factors; and (iii) complex genetic and idiopathic environmental factors, described as multifactorial inheritance 1,4,7-10. One-third of these causes are attributed to identifiable environmental and genetic factors, whereas the rest are believed to be multifactorial inheritance-related 1,4,7-10. Additionally, the environmental endowment of women of reproductive age is thought to operate through their socioeconomic and sociodemographic characteristics, leading to causes of MESBDs described as sociodemographic-environmental factors 1,4,8-10.
Completing more years of education could improve maternal health because educated women are more likely than those with low levels of education to make informed reproductive health choices that improve birth outcomes 11-14. Some of the notable maternal decisions include planned pregnancy, preconception folic acid intake in anticipation of conception, and prompt prenatal care 11,13,15-20. Maternal occupation could depend on educational level; nonetheless, occupations such as farming could expose women of reproductive age to teratogenic pesticides 21. Organogenesis occurs in the first eight weeks of gestation; however, approximately half of pregnancies are unplanned/unintended and thus not recognized until the end of the first trimester 1,4,22-24.
To our knowledge, many studies on the risk factors have been published in developed countries; however, such publications are scanty in developing countries owing to the rarity of the defects, unplanned/unintended pregnancies, and difficulties in identifying these women until the end of the first trimester, when the defects have already formed 4. To address this gap, this study investigated maternal periconceptional exposure to environmental, sociodemographic-environmental, and multifactorial inheritance-related risk factors for MESBDs in Kiambu County, Kenya. The study assessed: maternal periconceptional exposure to pesticides and teratogenic therapeutic medicines proxied by maternal chronic illnesses (epilepsy and depression); multifactorial inheritance proxied by the history of siblings with birth defects, sex of the last born child, nature of pregnancy, and parity; and sociodemographic-environmental factors consisting of maternal age, level of education, occupation, and adequate prenatal care proxied by gestational age and preconception folic acid intake. The findings of this study could provide public health opportunities for the formulation of specific treatment strategies, preventive measures, risk-based surveillance systems, and clinical genetic services for the most prevalent MESBDs, regionally and nationally. Consequently, the objective of this study was to identify the risk factors for MESBDs among children in Kiambu County, Kenya.

Study design and settings
A hospital-based case-control study was conducted to identify the risk factors for MESBDs. The study participants were recruited as they presented to the child welfare clinics, neonatal/paediatric units, and occupational clinics for care during the data collection period from May 31st 2018 to July 31st 2019. A case-control design was optimal for this study given its suitability for investigating rare outcomes such as MESBDs. Even though a population-based design would have been preferable, the ease of recruiting case and control subjects within hospital settings favoured the hospital-based design. This was an observational study and was therefore reported as per the STROBE guidelines 25.
The study was conducted in 13 hospitals comprising three county referral hospitals (Kiambu, Gatundu, and Thika), eight sub-county hospitals (Karuri, Kihara, Wangige, Nyathuna, Lari, Tigoni, Lussigetti, and Kigumo), and two faith-based hospitals (Presbyterian Church of East Africa Kikuyu Orthopedic and African Inland Church Cure International) situated within Kiambu County, Kenya. Notably, neither population-based nor hospital-based surveillance systems for MESBDs existed in the county or the study hospitals. Nonetheless, cases detected by primary health providers during childbirth and in neonatal care were recorded for the compilation of monthly hospital reports and subsequent entry into the District Health Information System (DHIS). The cases were drawn from Kiambu, Thika, Gatundu, Tigoni, Kikuyu, and Cure hospitals, which provided occupational and rehabilitative health services to children with MESBDs. The controls, on the other hand, were drawn from Kiambu, Gatundu, Thika, Karuri, Kihara, Wangige, Nyathuna, Lari-Rukuma, Tigoni, Lussigetti, and Kigumo hospitals, which provided child welfare services to the under-fives. Kiambu is the second-most densely inhabited county, with an estimated population of 2.4 million people out of an estimated national population of 47.5 million 26. Its economic mainstay is largely agriculture, comprising tea, coffee, and dairy farming 26. Of the county's total estimated population, approximately 2.2% aged ≥5 years are living with lifelong disabilities 26. A study carried out in the county between 2014 and 2018 observed defects of the musculoskeletal system as the most prevalent single-system defects, followed by central nervous system, orofacial cleft, genital, ocular, and anal organ defects 27.

Study population and eligibility of participants
The study population consisted of children aged ≤5 years old seeking health services at the study hospitals during the study period spanning from May to July 2019. All children whose mothers consented to participate in the study were recruited.

Case definition and recruitment
Cases were defined as children aged ≤5 years born with at least one MESBD to resident women of Kiambu County and seeking health care services at the neonatal units, paediatric wards, child welfare clinics and/or occupational therapist clinics of the study hospitals during the three-month study period. The Research Assistants (RAs) liaised with team leads of the departments listed above to identify cases of MESBDs. The team leads had been working in these departments, thus were conversant with the cases seeking services. The team lead invited the mothers of the children who met the case definition to comfortable private rooms within the departments where informed consent was sought and interviews conducted by the RAs. All cases that met this definition and whose carers consented to participate were prospectively recruited into the study until the required sample was attained (see Sample size determination).

Control definition and recruitment
Controls were children aged ≤5 years born without any form of birth defect to resident women of Kiambu County and attending routine child welfare clinics at the study hospitals during the same three-month study period. The Research Assistants liaised with team leads of the child welfare clinics to identify children without any form of birth defect who were seeking routine immunization and growth monitoring services. The team leads had been working in these clinics and were therefore familiar with most of the under-fives seeking the services. These services are provided between 8.00 am and 5.00 pm from Monday to Friday; the team leads introduced the RAs, who then briefed the potential participants on the study objectives. Because of the relatively large number of controls available, they were selected by simple randomization using sealed envelopes once the sample population had been defined, and were frequency-matched to the cases by the day of presentation.
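The selection step described above can be sketched in code. This is a minimal illustration only: the function name, the identifiers, the per-day 3:1 allocation, and the use of a pseudo-random generator in place of sealed envelopes are all assumptions and are not taken from the study protocol.

```python
import random


def select_controls(cases_by_day, eligible_controls_by_day, ratio=3, seed=None):
    """Randomly pick up to `ratio` controls per case among children seen the same day.

    cases_by_day: dict mapping a day label to the number of cases recruited that day.
    eligible_controls_by_day: dict mapping a day label to a list of eligible control IDs.
    Both structures and the per-day allocation are hypothetical.
    """
    rng = random.Random(seed)  # stands in for drawing sealed envelopes
    selected = {}
    for day, n_cases in cases_by_day.items():
        pool = eligible_controls_by_day.get(day, [])
        k = min(ratio * n_cases, len(pool))  # cannot draw more than are available
        selected[day] = rng.sample(pool, k)  # simple random sample without replacement
    return selected


# Hypothetical example: two recruitment days
cases = {"day_1": 2, "day_2": 1}
pool = {"day_1": [f"C{i}" for i in range(10)], "day_2": ["C10", "C11"]}
chosen = select_controls(cases, pool, ratio=3, seed=1)
print({day: len(ids) for day, ids in chosen.items()})
```

Sampling within each day keeps the distribution of controls across days proportional to that of the cases, which is the sense of frequency matching assumed here.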

Sample size determination
The sample size was estimated using the formula of Kelsey JL et al. 28 for case-control studies, where n1 is the number of cases and n2 is the number of controls; p1 is the proportion of cases whose caregivers did not begin prenatal care in the first trimester (primary exposure); and p2 is the proportion of controls whose caregivers did not begin prenatal care in the first trimester, set at 57% 22,23. Zα/2 (1.96) and Zβ (-0.84) are the values specifying the desired two-tailed confidence level (95%) and statistical power (80%), respectively. The odds ratio (OR) for the effect of the primary exposure (cases whose caregivers did not begin prenatal care in the first trimester) was hypothesized to be 3.0 22,23. The ratio (r) of controls to cases was set at 3, and given these estimates, a total sample size of 408 participants was derived (102 cases and 306 controls).
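For readers who want to experiment with the calculation, the sketch below implements the commonly cited form of the Kelsey formula for unmatched case-control studies. The equation itself is not reproduced in this text, so the form used here is an assumption, as are the function name, the derivation of p1 from p2 and the OR, and the use of the magnitude 0.84 for Zβ. Results depend on the inputs and on rounding conventions and may not match the reported 102 cases and 306 controls.

```python
def kelsey_sample_size(p2, odds_ratio, r, z_alpha=1.96, z_beta=0.84):
    """Unrounded sample sizes under the (assumed) Kelsey formula.

    n1 = (z_alpha + z_beta)^2 * p_bar * q_bar * (r + 1) / (r * (p1 - p2)^2)
    n2 = r * n1
    p2: proportion of controls exposed; r: controls per case.
    Returns (n1 cases, n2 controls, p1 implied by p2 and the odds ratio).
    """
    # Exposure proportion among cases implied by p2 and the hypothesized OR
    p1 = (odds_ratio * p2) / (1 + p2 * (odds_ratio - 1))
    # Weighted average exposure proportion across cases and controls
    p_bar = (p1 + r * p2) / (r + 1)
    q_bar = 1 - p_bar
    n1 = ((z_alpha + z_beta) ** 2) * p_bar * q_bar * (r + 1) / (r * (p1 - p2) ** 2)
    return n1, r * n1, p1


# Inputs as stated in the text: p2 = 57%, OR = 3.0, r = 3
n_cases, n_controls, p1 = kelsey_sample_size(p2=0.57, odds_ratio=3.0, r=3)
print(round(p1, 3), n_cases, n_controls)
```

Returning unrounded values leaves the choice between rounding up and rounding to the nearest integer to the caller.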

Data collection process and study variables
Before data collection, four nursing graduate interns were recruited and trained as RAs on sound interviewing techniques and on deriving and validating information from antenatal care (ANC) booklets. This was to ensure that the data collection process, spanning three months (May 31st to July 31st, 2019), was conducted in a standardized manner. The ANC booklet contains the maternal profile, medical/surgical history, previous pregnancy history, clinical notes, and physical examination findings on ANC visits, among others. The maternal profile includes name, age, parity, gravidity, height, weight, last menstrual period (LMP), expected date of delivery (EDD), and date of first ANC. Face-to-face structured questionnaires (see Extended data) were administered to the mothers of the study participants by RAs in comfortable secluded rooms within neonatal units and occupational therapy clinics for cases and child welfare clinics for the controls. Data were gathered retrospectively on exposures to environmental teratogens (pesticides and teratogenic medicines proxied by chronic illnesses), multifactorial inheritance (parity, nature of pregnancy, history of siblings with MESBDs, and sex of the lastborn child), and sociodemographic-environmental factors (maternal age, education level, occupation, and adequate prenatal care proxied by gestational age and preconception folic acid intake). The predictors were assessed as shown in Table 1.
A conceptual framework depicting the predictor-outcome relationship is displayed in Figure 1. The flow chart of the simple-random systematic sampling strategy is shown in Figure 2.

Minimizing bias
Considering the potential biases inherent in case-control studies that were likely to invalidate the study results, deliberate attempts were made to minimize their occurrence. First, the research assistants were trained on sound interviewing techniques and on deriving and validating information from ANC booklets to minimize interviewer and information biases, respectively. To minimize recall bias, gestational age (weeks) at the first ANC was estimated from the dates of the last menstrual period and the dates of the first ANC obtained from the ANC booklets.
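The date arithmetic described above can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, gestational age is counted in completed weeks, and the ≤8 versus ≥9 week grouping mirrors the categories used in the statistical analysis.

```python
from datetime import date


def gestational_weeks_at_first_anc(lmp, first_anc):
    """Completed weeks between the last menstrual period and the first ANC visit."""
    if first_anc < lmp:
        raise ValueError("first ANC date precedes the last menstrual period")
    return (first_anc - lmp).days // 7  # whole weeks only (assumption)


def anc_timing_category(weeks):
    """Group gestational age at first ANC into the two analysis categories."""
    return "<=8 weeks" if weeks <= 8 else ">=9 weeks"


# Hypothetical booklet entries
weeks = gestational_weeks_at_first_anc(date(2018, 1, 1), date(2018, 3, 5))
print(weeks, anc_timing_category(weeks))
```

Deriving the value from recorded dates rather than asking the mother to recall it is what reduces the dependence on memory.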

Data processing and statistical analysis
Following data collection, filled questionnaires were manually checked daily for accuracy and completeness and subsequently entered into a Microsoft Excel spreadsheet (Microsoft Office Professional Plus 2019) by two independent data managers to reduce potential errors. The Excel dataset was validated and exported to Stata software version 14.0 (Stata Corporation, Texas, USA) for further cleaning, coding, and analyses. Descriptive analyses (means, medians, standard deviations, and ranges) were used to summarize continuous variables, whereas proportions and percentages for categorical variables were generated and presented in frequency tables. Afterward, the effect of each predictor on the odds of MESBDs was assessed using univariable logistic regression models at a liberal P-value (P≤0.20) 29. Gestational age (weeks) at first ANC, a continuous variable, was categorized into two groups (≤8 weeks and ≥9 weeks) for evaluation in the univariable analyses 1,4,22-24. Additionally, parity, a continuous variable, was grouped into primiparous and multiparous categories for assessment in the univariable analyses 30,31. Maternal age as a continuous variable was not significant in the univariable analyses; it was therefore recategorized into two groups (≤34 years and ≥35 years) and reassessed for statistical significance, because women aged at least 35 years have previously been reported to have an increased likelihood of giving birth to children with MESBDs 32. Variables found statistically significant in the univariable analyses were fitted to a multivariable model, where a backward stepwise approach was used to eliminate variables from the model at a P-value >0.05. To minimize confounding effects, elimination of nonsignificant predictors was only considered when their exclusion from the model did not yield more than a 30% change in the effects of the remaining variables 29.
Two-way interactions were fitted between the remaining variables of the final model and assessed for significance. A Hosmer-Lemeshow test was used to assess the goodness of fit of the logistic model, with a P-value of >0.05 being suggestive of a good fit.
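The elimination rule used during backward selection can be expressed as a small check. The sketch below is illustrative only: it assumes that the "30% change in the effects" is measured on the log-odds coefficients of the variables that remain in the model, and every name and number in it is hypothetical rather than taken from the fitted models.

```python
def percent_change(full_coef, reduced_coef):
    """Percentage change in a coefficient after another variable is dropped."""
    if full_coef == 0:
        raise ValueError("full-model coefficient must be non-zero")
    return abs(reduced_coef - full_coef) / abs(full_coef) * 100.0


def can_drop(p_value, coefs_full, coefs_reduced, alpha=0.05, max_change=30.0):
    """Apply the two conditions for removing a candidate variable.

    p_value: P-value of the candidate variable in the full model.
    coefs_full / coefs_reduced: coefficients of the remaining variables
    before and after the candidate is removed (same keys in both dicts).
    """
    if p_value <= alpha:
        return False  # still significant, so it stays in the model
    # Dropping is allowed only if no remaining effect shifts by more than max_change
    return all(
        percent_change(coefs_full[name], coefs_reduced[name]) <= max_change
        for name in coefs_reduced
    )


# Hypothetical coefficients with and without a nonsignificant candidate
full = {"maternal_age": -0.89, "sibling_history": 1.65}
reduced = {"maternal_age": -0.95, "sibling_history": 1.70}
print(can_drop(0.40, full, reduced))
```

A variable that fails the second condition would be retained as a potential confounder even though it is not statistically significant.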

Results
A total of 408 study respondents (102 cases and 306 controls) were enrolled in this study (Table 2).

Environmental teratogens:
Of the 408 study respondents, 15 (3.68%) were exposed to farm-sprayed pesticides, of whom four (3.92%) were in the case group and 11 (3.59%) were in the control group (Table 2).

Multifactorial inheritance:
Of the 408 study respondents, 403 (98.77%) had single gestations for the current child, of whom 99 (97.06%) and 304 (99.35%) were in the case and control groups, respectively (Table 2). Of the study participants, 15 (3.68%) had given birth to children with birth defects in previous gestations, with 9 (8.82%) in the case group and 6 (1.96%) in the control group (Table 2).
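As a worked illustration, the counts above for siblings with birth defects (9 of 102 cases and 6 of 306 controls) can be converted into a crude, unadjusted odds ratio with a Wald 95% confidence interval. The function name is hypothetical, and the result is not the adjusted estimate from the multivariable model, so the two should not be expected to coincide.

```python
import math


def crude_odds_ratio(exposed_cases, total_cases, exposed_controls, total_controls, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table."""
    a = exposed_cases                      # exposed cases
    b = total_cases - exposed_cases        # unexposed cases
    c = exposed_controls                   # exposed controls
    d = total_controls - exposed_controls  # unexposed controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# History of siblings with birth defects: 9/102 cases versus 6/306 controls
or_, lo, hi = crude_odds_ratio(9, 102, 6, 306)
print(f"crude OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The crude odds ratio from these counts is about 4.8; adjustment for other covariates is what the multivariable logistic model adds on top of this.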

Logistic regression analyses
Notably, all the factors assessed in the univariable analyses were associated with MESBDs at P≤0.20: age, education, occupation, sex of the lastborn child, history of siblings with birth defects, preconception folic acid intake, nature of pregnancy, pesticide exposure, chronic illnesses, parity, gestational age (weeks) at first ANC, and ANC beginning eight weeks post-conception (Table 3). Subsequently, these variables were fitted to the multivariable model for the final analysis, except gestational age at first ANC, education, occupation, and prenatal care beginning eight weeks post-conception, which were considered distal relative to pesticide exposure and preconception folic acid intake (Figure 1).
In the multivariable analysis, only maternal age and history of siblings with MESBDs were shown to be significant predictors at a 5% significance level (Table 4). Compared to women aged ≥35 years, those aged ≤34 years had lower odds of giving birth to a child with MESBDs (aOR: 0.41; 95% CI: 0.18-0.91; P=0.03), whereas women whose preceding children had a history of birth defects had higher odds (aOR: 5.21; 95% CI: 1.35-20.12; P=0.02) (Table 4).

Discussion
To our knowledge, this was the first case-control study conducted to identify the risk factors for MESBDs in the entire country. The study findings corroborated other studies showing that maternal age greater than 34 years had a strong association with the occurrence of MESBDs 34. The study observed that women aged ≤34 years were 59% less likely to give birth to children with MESBDs compared to those aged 35 years or older. Older maternal age has been strongly associated with MESBDs such as neural tube defects and orofacial clefts 34. Maternal age is a multifaceted risk factor whose mechanisms of action in the occurrence of MESBDs are underpinned by human biology and socio-economic endowment among women of reproductive age. From the biologic standpoint, genetic mutations and accumulation of chromosomal aberrations during the maturation of male germ cells have been attributed to the occurrence of MESBDs. Nevertheless, some limitations were inherent in this study. There was a likelihood of differential recall bias among the study respondents: mothers of cases were more likely than mothers of controls to remember their preconception period owing to the experience of MESBDs in the last birth, which could have affected the estimates.

Conclusions
This study showed that maternal age and history of siblings with MESBDs were the predictors of MESBDs in Kiambu County in Kenya. This pointed to a need to create awareness among couples not to delay childbearing beyond 34 years of age and to provide clinical genetic services such as genetic counseling and screening for families with a history of birth defects. To address this burden, the county should begin by designing and formulating a hospital-based surveillance framework for the most common MESBDs to inform specific public health interventions aimed at controlling and preventing specific MESBDs. Additionally, public awareness of the risk factors and prevention strategies for these defects should be created through short messages during media broadcasts, mobile phone digital platforms, community dialogues, and roadshows. Further, we recommend that similar studies be conducted nationally to inform surveillance, prevention, and control strategies for the most common MESBDs.

This is a good and important research area for newborn health, particularly now when much has been done on infectious diseases, unlike non-infectious diseases. We have observed a great decrease in infant mortality given the available maternal and newborn interventions on infectious diseases. A greater contribution of non-infectious diseases, particularly birth defects, to neonatal and infant mortality may be observed with time. Below are my inputs and comments regarding this study:

Data availability
○ The title, abstract and introduction are well written.
○ Current citations were used.
○ The study design is appropriate; however, the selection of cases was not appropriate given the study title and objectives. This can be acknowledged as one of the study limitations.
○ Cases were sampled from child welfare clinics, neonatal/pediatric units, and occupational and rehabilitation clinics. All these data sources represent survivors of MESBDs and most probably non-fatal MESBDs. It is difficult to find cases of fatal MESBDs such as neural tube defects (NTDs) in this subpopulation, as the majority will not survive to be seen in rehabilitation clinics.
○ The ascertainment period in the case definition is too long (5 years and below). This may lead to recall bias, as it will be very difficult for a mother to remember what happened during a pregnancy 3-4 years ago. It may also lead to the recruitment of survivors and non-fatal MESBD cases. This could be mitigated by restricting enrolment into the study to children below 1 or 2 years of age.
○ I understand that the data sources were the above-mentioned clinics, complemented by the ANC booklets. However, the methodology section also mentions DHIS, and I was wondering whether it was another data source that was used. This needs clarification so that the reader understands the sources of data for this study.
○ The methodology section needs more clarity on maternal age. Is it the age of the mother at conception of the referred case, or the age of the mother at data collection? It is also very important to define "residence", as it has implications for maternal exposures. Residence matters during conception and the antenatal period, when environmental exposures can affect the unborn child. Residence after delivery is of no significance.
In the univariable analyses, why are p-values for the reference categories included?
The discussion is a bit shallow. Comparison with more literature, a more in-depth look into the implications and significance of the findings, and attention to key relevant factors that showed no significant association in the current study would improve the Discussion. Paternal age was also mentioned as a key factor in previous studies but was not assessed in the current study. Why? There are also other possible limitations that are not mentioned, e.g. survivor bias and not controlling for some relevant variables, such as paternal age, in the multivariable analysis.
The conclusion goes somewhat beyond the scope of the study; for example, the awareness level of couples or the community was not assessed. A detailed and in-depth discussion citing other relevant literature would help readers to better understand the situation and to draw more appropriate conclusions.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Yes