Keywords
DHS, MICS, surveys, maternal and child health
Non-government and civil society organizations spend substantial time and resources collecting baseline data in order to plan and implement health interventions with marginalized populations, and to measure the impact of those interventions (Data for Impact, 2019). Typical methods involve baseline and endline household surveys, where the household residents are interviewed and asked a hundred or more questions about asset ownership, mother and child health, diet, health system access, and other topics of interest. The costs of these surveys vary depending on design, methods, sample size, survey length, and local context (Data for Impact, 2019), but in the authors’ experience tens of thousands of dollars is typical, and in some cases, much more. Depending on the number and nature of questions, interviews can be over an hour long, placing a burden on the respondents. In addition, the accuracy of the indicator estimates in NGO-led surveys may be insufficient for project design and monitoring purposes, due to relatively small sample sizes and the inherent high variability of the indicators of interest.
Meanwhile, estimates of numerous health and social indicators in many countries already exist in publicly available datasets, such as the Demographic and Health Surveys (DHS), supported by USAID (U.S. Agency for International Development, 2018), and the Multiple Indicator Cluster Surveys (MICS), supported by UNICEF (UNICEF, 2020), and it is worth considering whether these could serve as estimates of baseline conditions. DHS/MICS provide standardized data collected using rigorous methods and large sample sizes, and datasets are available on request for free. They are designed to be representative at the national, regional and provincial level (but rarely at lower levels, such as district and village, where NGOs are working), and probably exclude homeless, institutionalized and nomadic populations (Carr-Hill, 2013). DHS/MICS are collected every three to ten years, so there may be a gap of up to ten years between DHS/MICS data collection and the baseline conditions that the NGO wants characterized. Although some indicators’ descriptions have been modified and improved over time, care is taken to ensure that data are directly comparable across countries, regions and years (Hancioglu & Arnold, 2013; UNICEF, 2020; U.S. Agency for International Development, 2018). DHS/MICS surveys are adapted to specific country needs and are conducted by well-trained interviewers who have access to tools and guidelines for quality assurance throughout (UNICEF, 2020; U.S. Agency for International Development, 2018).
Using publicly available data to complement or replace NGOs’ primary data collection for project baseline measures and project monitoring would save valuable resources, reducing the burden on data collectors and respondents alike. A few studies have compared estimates between DHS/MICS and NGO surveys. One found that they provided very different estimates of electricity and water access in Kenya, Tanzania, and Uganda (Carr-Hill, 2017), and a second found that DHS and an NGO-led survey provided similar estimates of several maternal and child health indicators in Rwanda (Langston et al., 2015). Other studies found that estimates of the market share of faith-based health care providers by DHS and NGO surveys in sub-Saharan Africa were within 5 to 50% of each other (Wodon et al., 2012), and the confidence intervals for the difference between Lot Quality Assurance Sampling (LQAS) and DHS district-level estimates were within +/-10% for 15 of 37 health indicators (Anoke et al., 2015). Therefore, no consensus exists on the potential for DHS/MICS to substitute for NGO surveys.
We hypothesized that publicly available data can provide estimates of baseline conditions similar to those reported in NGO baseline reports when matched as closely as possible for location, year, and season of data collection. We tested this hypothesis by comparing indicator estimates from NGO reports with estimates calculated using DHS/MICS.
We collected and retained a sample of 46 NGO baseline reports through a combination of internet search and personal contacts with Canadian and Vietnamese NGOs using the following selection criteria:
i) household survey (n>100) which used valid methods and representative sampling to generate point estimates of maternal, newborn and child health indicators;
ii) conducted between 2005 and 2019;
iii) in a low- or middle-income country.
The baseline reports from NGOs working on maternal, newborn and child health covered 23 countries spanning South Asia (Bangladesh, India, Pakistan), Africa (Burkina Faso, Ethiopia, Ghana, Kenya, Liberia, Malawi, Mali, Mozambique, Nigeria, Senegal, South Sudan, Tanzania, Zambia), South/Central America (Bolivia, Honduras), the Caribbean (Haiti), and SE Asia (Laos, Myanmar, Philippines, Vietnam) (Table 1) (Berti, 2021). From the reports, we extracted: country name, NGO name, dates of data collection, population of study, inclusion/exclusion criteria, indicator name and definition, sample size (total and n for each indicator), and the indicator estimate (percentage and standard deviation (SD) if available).
Country | NGO: Source | NGO: Year | NGO: Sample size | NGO: Geographical location | NGO: Level* | DHS/MICS: Source | DHS/MICS: Year | DHS/MICS: Sample size | DHS/MICS: Geographical location | DHS/MICS: Level* |
---|---|---|---|---|---|---|---|---|---|---|
Bangladesh | A&T | 2010 | 4,400 | Divisions of Dhaka, Chittagong, Rajshahi, Khulna, Barisal, and Sylhet | 3rd | DHS | 2007 | 4,923 | Divisions of Dhaka, Chittagong, Rajshahi, Khulna, Barisal, and Sylhet | 3rd |
Bangladesh | PLAN (BORN) | 2016 | 900 | Upazilas of Pirgachha, Pirganj, Mithapukur, Kaunia and Gangachara (rural area only) and district of Rangpur | 1st, 2nd | DHS | 2014 | 265 | Division of Rangpur (rural area only) | 3rd |
Bangladesh | NIMS | 2017 | 963 | Divisions of Dhaka, Chittagong, Khulna, Rajshahi, Sylhet, Barisal | 3rd | DHS | 2014 | 409 | Divisions of Dhaka, Chittagong, Khulna, Rajshahi, Sylhet, Barisal | 3rd |
Bangladesh | PLAN (SHOW) | 2016 | 864 | Districts of Barisal, Chittagong and Rangpur | 2nd | DHS | 2014 | 1,314 | Divisions of Barisal, Chittagong and Rangpur | 3rd |
Bangladesh | WV | 2018 | 33,600 | National and by districts (Barisal, Pirojpur, Bandarban, Chittagong, Comilla, Dhaka, Gazipur, Gopalganj, Tangail, Bagerhat, Satkhira, Mymensingh, Netrakona, Sherpur, Naogaon, Rajshahi, Dinajpur, Nilphamari, Rangpur, Thakurgaon, Sunamganj, Sylhet) | 2nd, 5th | DHS | 2014 | 4,494 | National and by divisions (Barisal, Chittagong, Dhaka, Khulna, Rajshahi, Rangpur, Sylhet) | 3rd, 5th |
Bangladesh | WV (ENRICH) | 2016 | 1,323 | Districts of Thakurgaon and Panchagarh | 2nd | DHS | 2014 | 550 | Division of Rangpur | 3rd |
Bolivia | PLAN | 2019 | 214 | Regions of Chuquisaca, La Paz, Cochabamba, and Potosí | 4th | DHS | 2008 | 867 | Regions of Chuquisaca, La Paz, Cochabamba, and Potosí | 4th |
Burkina Faso | WUSC | 2016 | 1,005 | Regions North, Central-West and East | 4th | DHS | 2010 | 2,709 | Regions North, Central-West and East | 4th |
Ethiopia | A&T | 2010 | 3,000 | Regions of Tigray and SNNP | 4th | DHS | 2005 | 1,800 | Regions of Tigray and SNNP | 4th |
Ethiopia | PLAN (BORN) | 2017 | 905 | Zones of North Gondar, South Gondar and West Gojjam and region of Amhara | 3rd, 4th | DHS | 2016 | 369 | Region of Amhara | 4th |
Ethiopia | CARE | 2016 | 1,261 | Zones of East and West Hararghe and region of Afar | 3rd, 4th | DHS | 2016 | 1,630 | Regions of Oromia and Afar | 4th |
Ethiopia | NIMS | 2017 | 440 | Regions of Amhara, Tigray, Oromia, Benishangul-Gumuz, and SNNP | 4th | DHS | 2016 | 508 | Regions of Amhara, Tigray, Oromia, Benishangul-Gumuz, and SNNP | 4th |
Ethiopia | PLAN | 2018 | 537 | Regions of Amhara and SNNP | 4th | DHS | 2016 | 1,651 | Regions of Amhara and SNNP | 4th |
Ghana | PLAN (SHOW) | 2014 | 831 | Intervention/control districts in the regions of Eastern, Northern, and Volta | 2nd | DHS | 2014 | 775 | Regions of Eastern, Northern, and Volta | 4th |
Haiti | PLAN (SHOW) | 2016 | 860 | Communes of Fort-Liberté, Ouanaminte, and Trou-du-Nord | 2nd | DHS | 2012 | 237 | Department of North-east | 3rd |
Honduras | Red Cross | 2007 | 300 | Departments of Copán and Santa Bárbara | 3rd | DHS | 2005/06 | 524 | Departments of Copán and Santa Bárbara | 3rd |
India | Eficor | 2012 | 300 | District of Pakur | 2nd | DHS | 2005/06 | 620 | State of Jharkand | 3rd |
India | IntraHealth | 2010 | 14,090 | District of Pakur and Uttar Pradesh | 2nd | DHS | 2005/06 | 1,649 | States of Jharkand and Uttar Pradesh | 3rd |
Kenya | NIMS | 2017 | 3,941 | Provinces of Rift Valley, Western, Nyanza, Eastern, Coast | 3rd | DHS | 2014 | 12,011 | Provinces of Rift Valley, Western, Nyanza, Eastern, Coast | 3rd |
Kenya | Red Cross | 2012 | 154 | Districts of East Pokot, Central Pokot, and East Marakwet | 2nd | DHS | 2008/09 | 694 | Province of Rift Valley | 3rd |
Kenya | WV (ENRICH) | 2016 | 1,274 | Counties of Elgeyo Marakwet and Baringo (subdivision of the before called Rift Valley province) | 2nd | DHS | 2014 | 4,760 | Province of Rift Valley | 3rd |
Laos | NIOPH | 2018 | 115 | Province of Vientiane | 3rd | MICS | 2016/17 | 3,560 | Region North | 4th |
Laos | The World Bank | 2016 | 7,355 | Provinces of Phongsaly, Oudomxay, Houaphan, Xaiyabouly, Borlikhamxay | 3rd | MICS | 2016/17 | 7,131 | Region North | 4th |
Liberia | Red Cross | 2012 | 783 | Counties of Bomi, Gbarpolu, and Grand Gedeh | 3rd | DHS | 2013 | 848 | Counties of Bomi, Gbarpolu, and Grand Gedeh | 3rd |
Malawi | CARE | 2017 | 708 | Traditional authorities of Kasakula, Kalumo, Dzoole, Kayembe and districts of Ntchisi and Dowa | 1st, 3rd | DHS | 2015/16 | 925 | Districts of Ntchisi and Dowa | 3rd |
Mali | PLAN (BORN) | 2017 | 907 | Region of Sikasso | 4th | DHS | 2012/13 | 714 | Region of Sikasso | 4th |
Mozambique | CARE | 2017 | 1,262 | Districts of Funhalouro and Homoine and province of Inhambane | 2nd, 3rd | DHS | 2011 | 570 | Province of Inhambane | 3rd |
Mozambique | PLAN | 2019 | 5,921 | Districts of Moma, Mogovolas, Nampula, Eráti, Memba, and Nacala Porto | 2nd | DHS | 2011 | 358 | Province of Nampula | 3rd |
Myanmar | WV | 2016 | 831 | Village of Thabaung | 1st | DHS | 2015/16 | 275 | Region of Ayeyarwaddy | 4th |
Nigeria | PLAN (BORN) | 2016 | 1,658 | Local Government Areas of Bauchi, Dass, Katagum, Misau, Ningi, Alkaleri, Bogoro, Ganjuwa, Giade, Shira and state of Bauchi | 2nd, 3rd | DHS | 2013 | 577 | State of Bauchi | 3rd |
Nigeria | NIMS | 2018/19 | 510 | States of Kebbi and Sokoto | 3rd | DHS | 2018 | 1,525 | States of Kebbi and Sokoto | 3rd |
Nigeria | PLAN (SHOW) | 2016 | 1,770 | Intervention and control districts in the states of Sokoto and Zamfara | 2nd | DHS | 2013 | 1,096 | States of Sokoto and Zamfara | 3rd |
Pakistan | NIMS | 2017 | 1,620 | Cities of Lodhran, Rajanpur, Jamshoro and Swabi | 2nd, 3rd | DHS | 2012/13 | 2,636 | Provinces of Punjab, Sindh, and Khyber Pakhtunkhwa | 3rd |
Pakistan | Red Cross | 2012 | 1,166 | Districts of Battagram and Swat and province of Khyber Pakhtunkhwa | 2nd, 3rd | DHS | 2012/13 | 1,532 | Province of Khyber Pakhtunkhwa | 3rd |
Pakistan | WV | 2017 | 942 | District of Sukkur | 2nd | DHS | 2012/13 | 1,591 | Province of Sukkur | 3rd |
Philippines | NIMS | 2018 | 1,418 | Provinces of Camarines Norte, Masbate, Antique, Iloilo, Cebu, Bohol, and Zamboanga del Norte | 3rd | DHS | 2017 | 352 | Provinces of Camarines Norte, Masbate, Antique, Iloilo, Cebu, Bohol, and Zamboanga del Norte | 3rd |
Senegal | PLAN (SHOW) | 2016 | 828 | Intervention/control districts in the regions of Dakar, Ziguinchor, Tambacounda, Kaolack, Louga, Kedougou and Sedhiou | 2nd | DHS | 2010/11 | 2,307 | Regions of Dakar, Ziguinchor, Tambacounda, Kaolack, Louga, Kedougou and Sedhiou | 4th |
South Sudan | CMMB | 2015 | 500 | County of Nzarai | 1st | MICS | 2010 | 770 | State of Western Equatoria | 3rd |
Tanzania | NIMS | 2017 | 215 | Regions of Mwanza and Simiyu | 4th | DHS | 2015/16 | 408 | Regions of Mwanza and Simiyu | 4th |
Tanzania | PLAN | 2017 | 3,207 | Region of Mbeya, and districts of Sumbawanga DC, Sumbawanga MC, Nkasi DC, and Kalambo DC (in the region of Rukwa) | 2nd, 4th | DHS | 2015/16 | 282 | Regions of Mbeya and Rukwa | 4th |
Tanzania | WV | 2017 | 1,476 | Region of Kigoma | 4th | DHS | 2015/16 | 245 | Region of Kigoma | 4th |
Tanzania | WV (ENRICH) | 2016 | 1,399 | Districts of Itigi, Manyoni, Ikungi, Kahma, Shinyanga, Kishapu and Ushetu | 2nd | DHS | 2015/16 | 556 | Regions of Shinyanga and Singida | 4th |
Vietnam | A&T | 2011 | 4,029 | Regions of North Central and Central Coastal area, Northern Midlands - Mountainous area, Central Highlands, Mekong River Delta | 4th | MICS | 2010/11 | 7,140 | Regions of North Central and Central Coastal area, Northern Midlands - Mountainous area, Central Highlands, and Mekong River Delta | 4th |
Vietnam | CARE | 2015 | 594 | Districts of Bao Lac, Tu Mo Rong, Que Phong and provinces of Nghe An, Cao Bang, and Kon Tum | 2nd, 3rd | MICS | 2013/14 | 4,095 | Regions of North Central and Central Coastal area, Northern Midlands - Mountainous area, and Central Highlands | 4th |
Vietnam | Oxfam | 2014 | 1,982 | Districts of Da Bac, Hoa Binh, Binh Gia, Lang Son, Phu Cu, and Hung Yen, and provinces of Hoa Binh, Hung Yen, and Lang Son | 2nd, 3rd | MICS | 2013/14 | 573 | Regions of Northern Midlands - Mountainous area, and Red River Delta | 4th |
Zambia | CARE | 2016 | 735 | Towns of Mpika and Shiwang'andu | 2nd | DHS | 2013/14 | 854 | Province of Muchinga | 3rd |
* 1st level represents village, town, locality or traditional authority; 2nd level: district or equivalent; 3rd level: province, state or equivalent; 4th level: region; 5th level: country.
DHS: Demographic and Health Surveys; MICS: Multiple Indicator Cluster Surveys; NGO: non-governmental organization.
We also retained the location of data collection (e.g. country, region, province, district, and/or village) and geographical level. These geographical levels of data aggregation were defined as: (1) the smallest geographical subdivision in a country (village, town, locality, traditional authority); (2) district or district council (larger than a village but smaller than the third level); (3) province, state, department, county or district (if it refers to a division equivalent to province or state); (4) region (combining several units of level 3); (5) country level.
We matched 25 DHS and 3 MICS surveys (from Vietnam, Laos, and South Sudan) with 46 NGO baseline reports (Table 1). We used the most recent DHS/MICS survey carried out prior to the NGO baseline survey, with some surveys matching more than one NGO survey.
Indicators from DHS/MICS were calculated following the methods recommended by DHS/MICS accounting for weighting and sample selection (Croft et al., 2018). Wherever possible, we used the methods employed by the NGO to create the matching DHS/MICS indicator. For instance, if the NGO baseline survey included women of reproductive age and their children aged 0-24 months living in the district of Homoine in Mozambique, we extracted the same sample from the DHS/MICS. In the absence of representative data from the same geographical level, we used DHS/MICS data from the next level up in the geopolitical hierarchy to match the lower level from the NGO. For instance, if data from the district of Homoine were not available in the DHS, we used data from the province of Inhambane (one level up).
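The weighting step can be illustrated with a minimal sketch. It assumes each respondent record carries a sampling weight and a yes/no value for the indicator; the record layout, the example values, and the function name `weighted_prevalence` are hypothetical and are not taken from the DHS/MICS tools. Any common scaling of the weights cancels out in the ratio, and the sketch does not cover the cluster and stratum adjustments needed for standard errors.

```python
from typing import Iterable, Tuple


def weighted_prevalence(records: Iterable[Tuple[float, bool]]) -> float:
    """Return the weighted percentage of records where the indicator is true.

    Each record is a (sampling_weight, has_indicator) pair (hypothetical layout).
    """
    total_weight = 0.0
    positive_weight = 0.0
    for weight, has_indicator in records:
        total_weight += weight
        if has_indicator:
            positive_weight += weight
    if total_weight == 0:
        raise ValueError("no weighted observations")
    return 100.0 * positive_weight / total_weight


# Hypothetical respondents from one province: (weight, received 3+ ANC visits)
sample = [(1.2, True), (0.8, False), (1.0, True), (1.0, False)]
estimate = weighted_prevalence(sample)
```

Restricting `sample` to the same age group and geographical unit as the NGO survey before calling the function mirrors the matching described above.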
We matched similar indicators from NGO baseline reports with DHS/MICS wherever available and excluded those that had no match in the DHS/MICS datasets. Table 2 provides an example of how the data were matched for the indicator “Woman received at least three antenatal care visits (ANC) during last pregnancy”.
* 1st level represents village, town, locality or traditional authority; 2nd level: district or equivalent; 3rd level: province, state or equivalent; 4th level: region; 5th level: country.
ANC: antenatal care; DHS: Demographic and Health Surveys; MICS: Multiple Indicator Cluster Surveys; NGO: non-governmental organization.
In total there were 129 indicators (Table 3) from eight main groups including child anthropometry, child diet, child health, household characteristics, household wealth, maternal characteristics, maternal health, and WASH. We excluded estimates based on fewer than ten observations (n=64), in either the DHS/MICS or NGO data, retaining a total of 1,996 pairs of NGO-DHS/MICS indicators for analyses.
* for a complete list of all the indicators see Table 2 in HealthBridge (2020).
HH: household; WASH: Water, Sanitation, and Hygiene; DPT: diphtheria, pertussis and tetanus; ORS: oral rehydration salts; ORT: oral rehydration therapy; SBA: skilled birth attendant; ANC: antenatal care; PNC: postnatal care; TT: tetanus toxoid.
After collating the data, we grouped similar indicators into 37 subgroups (Table 3) on the basis of whether they had similar definitions/concepts (e.g. stunting prevalence in different age groups). We refined the grouping by using scatterplots of the difference of estimates by year difference and geographical level difference to check if any indicators differed widely from others in the grouping. After assessing the indicators graphically, we separated “Diarrhea in the last two weeks: 0-5m” from the same indicator for other age groups since the differences of estimates were closer to zero for this age group than the others. We also separated “Household has a car” from the subgroup “Household has agricultural land/bike/phone” since car ownership was much lower than ownership of other assets.
NGO versus DHS/MICS
We subtracted NGO from DHS/MICS estimates to calculate difference and absolute difference between estimates.
To compare data from NGO and DHS/MICS we used: same or different season of data collection; number of years difference between data collection (DHS/MICS year - NGO year); and number of geographical levels difference (DHS/MICS level - NGO level). If data collection spanned two years, for instance data collection started in 2013 and was completed in 2014, the year of data collection was coded as “2013.5”. Geographical level difference was calculated by subtracting the NGO level from DHS/MICS level. For example, we subtracted district level data available from the Mozambique NGO survey (level=2) from province level data collected in the DHS (level=3), making the geographical level difference one. We grouped geographical level differences as: no difference; one level difference; 2-3 levels difference.
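The comparison variables described above could be derived roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, year labels are assumed to look like "2014" or "2015/16", and the geographical level difference is assumed to be non-negative. The two estimates in the example are invented for illustration; only the years and levels follow the Mozambique example.

```python
def parse_year(label: str) -> float:
    """Convert '2014' to 2014.0 and a two-year span such as '2015/16' to 2015.5."""
    if "/" in label:
        return float(label.split("/")[0]) + 0.5
    return float(label)


def level_group(level_difference: int) -> str:
    """Group the level difference (DHS/MICS level minus NGO level)."""
    if level_difference < 0:
        raise ValueError("this sketch assumes DHS/MICS level >= NGO level")
    if level_difference == 0:
        return "no difference"
    if level_difference == 1:
        return "one level difference"
    return "2-3 levels difference"


def compare(dhs_estimate, ngo_estimate, dhs_year, ngo_year, dhs_level, ngo_level):
    """Return the comparison variables for one DHS/MICS-NGO pair."""
    difference = dhs_estimate - ngo_estimate  # DHS/MICS minus NGO
    return {
        "difference": difference,
        "absolute_difference": abs(difference),
        "year_difference": parse_year(dhs_year) - parse_year(ngo_year),
        "level_group": level_group(dhs_level - ngo_level),
    }


# Province-level DHS (level 3, 2011) against a district-level NGO survey
# (level 2, 2017); the estimates 62.0 and 70.5 are made-up values.
pair = compare(62.0, 70.5, "2011", "2017", 3, 2)
```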
We plotted how difference and absolute difference between DHS/MICS and NGO estimates varied with the indicator and indicator grouping. We used Analysis of Variance (ANOVA) to partition the variance of difference or absolute difference between estimates (DHS/MICS estimate - NGO estimate) by indicator, geographical level difference (as 0,1,2+), year difference (continuous), and season (same season, different season, season unknown).
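To illustrate what partitioning variance means, the sketch below computes the share of the total sum of squares explained by a single grouping factor (often called eta squared). It is a one-factor simplification written for illustration, not the multi-factor ANOVA model fitted in the study; the function name and the example values are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def share_of_variance(values: Sequence[float], groups: Sequence[Hashable]) -> float:
    """Fraction of the total sum of squares explained by group membership."""
    if len(values) != len(groups) or not values:
        raise ValueError("values and groups must be non-empty and the same length")
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("all values are identical; nothing to partition")
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    # Between-group sum of squares: group size times squared deviation
    # of the group mean from the grand mean.
    ss_between = sum(
        len(members) * (sum(members) / len(members) - grand_mean) ** 2
        for members in by_group.values()
    )
    return ss_between / ss_total


# Made-up absolute differences for two indicators
abs_diff = [2.0, 4.0, 10.0, 12.0]
indicator = ["car", "car", "anc", "anc"]
share = share_of_variance(abs_diff, indicator)
```

A share near 1 means the grouping factor accounts for most of the variation; a share near 0 means the variation is largely unattributed to that factor.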
DHS versus DHS
To better understand how much differences in methods between the survey sources (DHS/MICS and NGO) contribute to the differences in estimates, we repeated the analyses used to compare DHS/MICS and NGO estimates, this time comparing DHS data from one country, year and geographical level with DHS data from a different year and/or geographical level in the same country. The assumption is that DHS methods are similar between years and geographical levels, whereas DHS/MICS and NGOs may use somewhat different methods. There is a level of discordance between DHS/MICS and NGO estimates, and there would also be discordance between two DHS estimates. Any excess of DHS/MICS-NGO discordance over DHS-DHS discordance would then be attributable not to differences in years or geographical levels, but to differences in methods.
For the DHS-DHS comparisons, we compiled DHS data from the seven countries that contributed the most pairs in the DHS/MICS-NGO dataset: Bangladesh, Ethiopia, Kenya, Malawi, Pakistan, Tanzania, and Zambia. Retaining the same indicators as in the DHS/MICS - NGO comparisons, we calculated estimates for different geographical levels, i.e. at the country level, and for each region, province and district available. For this analysis, we included district data to mimic the NGO data, even though these estimates are not always representative at this level in the DHS. We excluded indicators based on a sample size smaller than ten observations (n=26,539).
We matched DHS indicators from different cycles and geographical levels using different combinations mimicking the actual DHS/MICS-NGO scenarios: indicators from the same level but different years (Scenario 1), indicators from the same year but different levels (Scenario 2), and indicators from different years and levels (Scenario 3). To mimic the NGO data, we used data from the most recent cycle and the lower geographical levels, whereas to represent the comparative DHS data we used data from an older DHS cycle and a higher geographical level. Using DHS data only, we were not able to simulate a scenario where DHS/MICS and NGO data were from the same year and geographical level. Table 4 provides an example of how we compared the estimates for an ANC indicator in Zambia using 31 pairs from DHS in the three scenarios for this one country. Repeating across all indicators and all countries yielded 109,251 pairs of DHS-DHS indicators.
We calculated the difference and absolute difference between these pairs of estimates, mimicking the scenarios from the DHS/MICS-NGO data. Table 5 summarises the DHS cycles included as well as the geographical level comparison for each scenario in each of the seven countries.
Finally, as with DHS/MICS vs NGO estimates, we used ANOVA to partition the variance of difference or absolute difference between DHS estimates by indicator, geographical level difference, and year difference. We did not include season in this analysis since most DHS data are collected during the same season within a country.
We simulated a situation where the only source of imprecision of the indicator’s measures would be from sampling error, in order to separate this known and estimable source of error from other sources of error that lead to differences in indicator estimates. The simulation samples from a "true" prevalence (p) of 1%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, and 99%. We assumed an n of 500, which was a typical sample size of both DHS and NGO samples in our data set. We then generated a “Baseline Estimate 1” (to mimic the DHS/MICS estimates) by drawing randomly from a binomial distribution with mean n*p and variance np(1-p). A “Baseline estimate 2” (to mimic the NGO estimate) was generated in the same way, and the difference between the first and second estimate was calculated. We ran 1,000 iterations to estimate the distribution of the differences.
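The simulation described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in the study; the function name `simulate_differences` and the fixed seed are assumptions made so that the example is reproducible.

```python
import random


def simulate_differences(p, n=500, iterations=1000, seed=0):
    """Simulate differences (in percentage points) between two baseline
    estimates that differ only through sampling error.

    p is the "true" prevalence as a proportion; each estimate is a binomial
    draw of size n, generated here as a sum of n Bernoulli trials.
    """
    rng = random.Random(seed)

    def estimate():
        successes = sum(rng.random() < p for _ in range(n))
        return 100.0 * successes / n

    # Baseline Estimate 1 (mimicking DHS/MICS) minus Baseline Estimate 2 (mimicking NGO)
    return [estimate() - estimate() for _ in range(iterations)]


diffs = simulate_differences(0.5)
abs_diffs = sorted(abs(d) for d in diffs)
median_abs = abs_diffs[len(abs_diffs) // 2]
```

For p = 50% and n = 500, the standard deviation of the difference between two independent estimates is sqrt(2 × 0.5 × 0.5 / 500) ≈ 3.2 percentage points, so simulated absolute differences should mostly fall within a few percentage points. Calling the function for each of the eleven prevalence values gives the full set of distributions.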
In order to investigate how absolute differences vary by the nature of the point prevalence estimates we used box plots to compare simulated, DHS-DHS and DHS/MICS-NGO absolute differences.
All data were compiled in Microsoft Excel 15 and analyzed with SAS 9.4.
This study respects current research ethics standards and it was approved by the Health Research Ethics Board of the Université de Montréal (CERSES-19-030-D).
The NGO reports often presented over 100 indicators in their baseline reports. On average, 18 of their indicators were also available in the DHS/MICS datasets. The estimate sample size ranged from 12 to 16,530 for the NGO surveys and from 10 to 98,446 for the DHS/MICS. Table 6 presents, by indicator subgroup, mean DHS/MICS and NGO percentage prevalence estimates, mean difference between pairs (DHS/MICS minus NGO) and percentage of differences falling within 5 and 20 percentage points. Some subgroups have a mean difference close to zero, but almost all have at least some pairs that are widely different (not within 20 percentage points). Fifteen subgroups had positive (DHS/MICS>NGO) and 21 had negative (DHS/MICS<NGO) mean differences, but we identified no meaningful pattern in which indicators were negative and which were positive, and all the differences (except for consumption of vitamin A-rich foods) were within 1 standard deviation of 0.
Figure 1 presents the scatterplots of NGO against DHS/MICS estimates by subgroup of indicators. For all subgroups, there was some correlation between the DHS/MICS and NGO estimates. Figure 2 shows the boxplot distribution of the mean difference between estimates by subgroup. The only subgroups that had all the pairs of indicators within ±20% were “Consumption of vitamin A-rich foods”, “Bottle fed yesterday”, “Diarrhea in the last two weeks: 0-5m”, “Diarrhea in the past two weeks: given more to eat”, and “Household has a car”. Other indicators that had most of their pairs within ±20% were “Household treats drinking water” and “Ever married”. All the indicators with the smallest differences between estimates had very low or very high prevalence (Table 6), except for “Consumption of vitamin A-rich foods” (that was based on only four pairs of estimates).
Abbreviations: BF: breastfeeding; HH: household; HF: health facility; SBA: skilled birth attendant; ANC: antenatal care; PNC: postnatal care; DHS: Demographic and Health Surveys; MICS: Multiple Indicator Cluster Surveys; NGO: non-governmental organization.
Abbreviations: Anthros: anthropometry indicators; HH: household; WASH: Water, Sanitation, and Hygiene; BF: breastfeeding; HF: health facility; SBA: skilled birth attendant; ANC: antenatal care; PNC: postnatal care; DHS: Demographic and Health Surveys; MICS: Multiple Indicator Cluster Surveys; NGO: non-governmental organization.
Table 7 summarizes the absolute differences between DHS/MICS and NGO, and between DHS and DHS. They are summarized according to the similarity of data collection timing (year and season), geographical level, and sample size. Using the absolute difference enabled us to see the size of the difference without taking the direction into account. The absolute difference between DHS/MICS and NGO estimates increases as the year difference increases, as the geographical level difference increases, and as sample sizes decrease. The differences between DHS and DHS show similar patterns in terms of broad geographical level, sample size, and time difference (≥3.5 years versus 0 to 3 years).
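Variable | DHS/MICS vs NGO: N | DHS/MICS vs NGO: Mean | DHS/MICS vs NGO: SD | DHS/MICS vs NGO: Median | DHS/MICS vs NGO: IQR | DHS vs DHS: N | DHS vs DHS: Mean | DHS vs DHS: SD | DHS vs DHS: Median | DHS vs DHS: IQR |
---|---|---|---|---|---|---|---|---|---|---|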
Year difference | ||||||||||
≤1 year | 495 | 11.6 | 10.4 | 9.2 | 12.6 | 56185 | 10.1 | 10.2 | 6.9 | 11.7 |
1.5-3 years | 860 | 12.8 | 12.8 | 8.4 | 15.9 | 8024 | 9.3 | 9.1 | 6.6 | 10.7 |
≥3.5 years | 641 | 13.8 | 13.2 | 10.1 | 15.1 | 45042 | 13.6 | 13.9 | 9.2 | 15.3 |
Season | ||||||||||
Same season | 1153 | 13.1 | 12.8 | 9.0 | 14.4 | - | - | - | - | - |
Different season | 603 | 11.8 | 11.2 | 8.5 | 14.6 | - | - | - | - | - |
Season unknown | 240 | 14.2 | 13.0 | 10.3 | 16.2 | - | - | - | - | - |
Geographical level difference | ||||||||||
0 | 677 | 12.5 | 12.6 | 8.3 | 14.2 | 9024 | 10.1 | 11.5 | 6.2 | 11.4 |
1 | 897 | 13.1 | 12.3 | 9.6 | 15.4 | 30275 | 10.5 | 10.9 | 7.1 | 11.9 |
2+ | 422 | 12.8 | 12.2 | 9.0 | 14.9 | 69952 | 12.1 | 12.4 | 8.2 | 13.8 |
Geographical level 1a,b | ||||||||||
Country | 14 | 7.7 | 7.7 | 3.9 | 6.1 | 61230 | 11.9 | 12.2 | 8.1 | 13.7 |
Region | 1259 | 13.1 | 12.9 | 9.0 | 15.9 | 25248 | 10.9 | 12.2 | 7.0 | 12.0 |
Province | 723 | 12.4 | 11.4 | 9.3 | 13.4 | 22773 | 10.8 | 10.8 | 7.5 | 12.6 |
Geographical level 2c,d | ||||||||||
Country | 14 | 7.7 | 7.7 | 3.9 | 6.1 | 896 | 7.0 | 8.7 | 3.9 | 7.2 |
Region | 369 | 12.6 | 12.2 | 8.7 | 14.2 | 8826 | 11.3 | 13.1 | 7.1 | 12.7 |
Province | 422 | 12.5 | 12.4 | 9.0 | 13.7 | 30875 | 9.1 | 9.7 | 5.9 | 10.1 |
District | 963 | 13.0 | 12.3 | 9.3 | 15.2 | 68654 | 12.6 | 12.6 | 8.8 | 14.5 |
Village | 228 | 13.5 | 13.3 | 8.6 | 17.6 | - | - | - | - | - |
Sample size 1a,b | ||||||||||
Tertile 1 (na=335, nb=709) | 663 | 14.1 | 13.1 | 9.8 | 16.6 | 36418 | 11.5 | 12.1 | 7.8 | 12.6 |
Tertile 2 | 656 | 12.9 | 12.4 | 9.3 | 15.0 | 36695 | 11.2 | 11.7 | 7.5 | 12.9 |
Tertile 3 (na=772, nb=5282) | 677 | 11.6 | 11.5 | 8.2 | 13.3 | 36138 | 11.7 | 12.1 | 7.8 | 13.8 |
Sample size 2c,d | ||||||||||
Tertile 1 (nc=236, nd=37) | 664 | 14.8 | 13.7 | 10.4 | 17.5 | 36480 | 13.7 | 13.0 | 10.0 | 14.9 |
Tertile 2 | 668 | 12.0 | 12.0 | 8.1 | 14.0 | 36407 | 11.7 | 11.9 | 8.0 | 13.1 |
Tertile 3 (nc=757, nd=104) | 664 | 11.7 | 11.1 | 8.7 | 13.4 | 36364 | 9.0 | 10.4 | 5.4 | 10.3 |
Table 8 shows the partition of variance results from the DHS/MICS vs NGO and DHS vs DHS comparisons. For DHS/MICS vs NGO, about 15% of the variance was attributed to the indicator and less than 1% to geographical level, year and season difference. For DHS vs DHS, geographical level and year accounted for more of the variation in absolute difference (1.25% and 4.5%, respectively). However, in all cases, most (>82%) of the variance was unattributed, that is, it remained unexplained by the model.
Results from all three comparisons, DHS/MICS - NGO, DHS - DHS, and Simulations, are shown in Figure 3 as boxplots of the absolute difference between estimates by the indicator reference value (the DHS estimate or the estimate simulating DHS). The distribution of absolute differences is similar between DHS/MICS - NGO and DHS - DHS, with DHS/MICS - NGO showing only a slightly larger spread. For all three types of comparisons, the distribution of the absolute difference between estimates is narrower at the extremes and wider when the reference value is between 35% and 65%. Since the simulated sampling error differences are small (range <10%), only a small proportion of the differences can be attributed to sampling error.
Absolute difference between estimates calculated as:
Simulation: Simulated estimate 1 - Simulated estimate 2
DHS vs DHS: DHS estimate - DHS mimicking the NGO estimate (lower geographical level, more recent year of data collection)
DHS/MICS vs NGO: DHS/MICS estimate - NGO estimate
Reference value: DHS or the estimate mimicking DHS (higher geographical level, earlier year of data collection)
Abbreviations: DHS: Demographic and Health Surveys; MICS: Multiple Indicator Cluster Surveys; NGO: non-governmental organization.
Our study showed that many indicators presented large differences between NGO and DHS/MICS estimates. Almost all indicators had at least some pairs that were widely different. Only about 33% of the pairs of indicators were within 5 percentage points, and about 80% were within 20 percentage points. Agreement between indicators was higher when comparing indicators that had low or high prevalence (e.g. <15% or >85%), which is consistent with sampling theory, but throughout the prevalence range, the distribution of differences in the DHS/MICS-NGO and DHS-DHS comparisons is wider than that found from sampling error alone (reflected in the simulation distribution). An NGO could obtain an accurate estimate using DHS/MICS data for indicators with expected values close to 0% or 100%.
We had hoped that if DHS/MICS and NGO estimates were similar, then NGOs could forgo baseline data collection and substitute DHS/MICS estimates, or estimates from some other publicly available dataset, saving NGO time and money and reducing respondent burden. While we cannot give a blanket recommendation that DHS and MICS could always replace NGO baseline surveys, there are at least some situations where DHS/MICS could be used to the NGO’s advantage: when the estimate is expected to be less than 15% or above 85%; when the indicator of interest is one of the few with consistent similarity between DHS/MICS and NGO estimates; and when the NGO can tolerate estimates of low or unknown accuracy.
We had hypothesized that publicly available data can provide estimates of baseline conditions similar to those reported in NGO baseline reports when matched as closely as possible for location, year, and season of data collection. From the descriptive analyses, we found that as year difference increased, the mean difference between estimates slightly increased, and estimates derived from lower geographical levels (such as village or district from NGO and province for DHS/MICS) contributed to a higher mean absolute difference between estimates. In general, larger sample sizes were obtained at higher geographical levels, and the larger the sample size (with its smaller sampling error) from DHS/MICS or NGO, the smaller the mean absolute difference between estimates. This means that the advantage of geographical proximity is offset by the larger sampling error associated with small sample sizes. Whether the seasons of data collection were matched or different did not make a measurable difference to the similarity between estimates.
However, the partition of variance analyses showed that DHS/MICS and NGO estimates differed, for the most part, in unpredictable ways, and geographical levels, years difference and seasons explained only a small part of the variation.
We hypothesize that large differences between estimates from NGO baseline reports and DHS/MICS data are due to three main reasons:
(i) It is possible that NGOs’ estimates are collected from different populations with different underlying true values. NGOs often try to target lower-wealth villages, so their baseline estimates may reflect worse conditions than the nationally representative DHS/MICS estimates. Note, however, that differences in household wealth indicators were small (e.g. “Household has electricity” 0.8% difference; “Household has a car” 0.2% difference). Additionally, the differences between DHS/MICS and NGO estimates might reflect actual changes over the years or across different geographical locations. Analyses comparing data from the same source (DHS) but from different years and geographical levels also showed large differences between estimates.
(ii) Different methods employed while sampling, collecting, processing and analyzing data might also have contributed to the differences between DHS/MICS and NGO estimates.
(iii) Several indicators related to maternal and child health included in this study have not been validated, and some have been shown to have low validity, such as maternal report of skilled birth attendance (Blanc et al., 2016). Inappropriate conflation of answer options and inconsistent coding and analysis of DHS surveys have also been documented (Footman et al., 2015). High measurement error can result in bias of unpredictable direction and magnitude, resulting in large differences between estimates.
Whatever the cause of the large differences between estimates, it was not possible to know which of the data sources (DHS/MICS or NGO) provided the more accurate estimate of the true prevalence in the NGOs’ target populations. Furthermore, while we have been comparing DHS/MICS and NGO point estimates, these indicators are measured with error. The standard error (SE) for the DHS indicators is greater than 5% in eleven percent of the estimates. An estimate with a standard error of 5% will have a 95% confidence interval of approximately ±9.8% (1.96 × 5%).
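The arithmetic behind the ±9.8% figure can be written out as a short sketch. It assumes a normal approximation, with estimates and standard errors expressed in percentage points; the function names and the 40% example estimate are hypothetical, and clipping the interval to the 0-100 range is a simplification.

```python
Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval

def ci_half_width(se, z=Z_95):
    """Half-width of a normal-approximation confidence interval."""
    return z * se

def confidence_interval(estimate, se, z=Z_95):
    """Return (lower, upper) bounds in percentage points, clipped to 0-100."""
    half = ci_half_width(se, z)
    return max(0.0, estimate - half), min(100.0, estimate + half)

# A standard error of 5 gives a half-width of about 9.8 (1.96 x 5).
print(round(ci_half_width(5.0), 1))

# Hypothetical indicator estimated at 40% with a standard error of 5.
lower, upper = confidence_interval(40.0, 5.0)
print(f"95% CI: {lower:.1f}% to {upper:.1f}%")
```

For this hypothetical indicator the interval runs from about 30% to about 50%, so two surveys could report point estimates many percentage points apart while remaining statistically compatible.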
Our analyses document and try to understand the large differences between NGO and DHS/MICS estimates. However, a study comparing DHS data to a small population-based survey from Rwanda showed that nine out of fifteen indicators related to maternal, newborn and child health were within a 10% difference (Langston et al., 2015). Similarly, in case studies from Nepal and Vietnam (HealthBridge, 2020) there were many indicators where the DHS/MICS and NGO estimates were similar. In Nepal, 70% of indicators were within 20% of one another. Estimates for ANC, iron-folic acid uptake, vitamin A supplementation at 18-23 months and mobile ownership were similar, while breastfeeding, child dietary diversity and tetanus vaccination in pregnancy differed widely. In contrast, in Vietnam, NGO estimates for exclusive/continued breastfeeding and dietary diversity at 6-8 months were close to DHS, while others differed by >30%. Using secondary data may be useful, especially where budgets or mobility are constrained, as during the COVID-19 pandemic, when data collection opportunities were limited. However, use of DHS surveys may risk underestimating the scale of problems for poor and marginalised groups such as nomads or slum dwellers (Carr-Hill, 2017). When using DHS/MICS data, the user must keep in mind the potential differences between DHS/MICS and NGO estimates.
This study had some limitations. Most NGO data we used came from unpublished, non-peer-reviewed reports created for internal use only. Indicators extracted from NGO reports were not necessarily consistent across all reports, and SDs or SEs were often missing. Although we matched the methods employed by the NGO as closely as possible in order to obtain the same indicators from DHS/MICS, some reports provided limited information concerning methods of data collection and analysis. The dates and season of data collection were impossible to assess for eight reports. Assigning the geographical level of data from the NGO report was also challenging for some settings due to lack of contextual information. However, we were able to communicate with several NGOs in order to obtain supplementary information about the reports’ methods.
Our hypothesis was that publicly available data can provide estimates of baseline conditions similar to those reported in NGO baseline reports when matched as closely as possible for location, year, and season of data collection. Our answer to this, in brief, is that publicly available data can be used, if the NGO is tolerant of imprecise estimates.
While an NGO may use the evidence presented here to justify forgoing its own baseline survey, it should keep in mind that DHS and MICS provide estimates for only some of the indicators of interest to the NGO. On average, we estimated 18 of the NGO’s indicators using DHS/MICS, but NGOs were often reporting 100+ estimates. Furthermore, collecting data in the NGO working area can provide valuable insights for project design and implementation.
This study used data owned by the DHS, the MICS and the NGOs that shared their baseline reports. The DHS data can be downloaded at: https://www.dhsprogram.com, and the MICS data can be obtained at: https://mics.unicef.org. The DHS and MICS require registration, and data access is granted only for legitimate research purposes.
The NGO reports were either available online on each NGO website or obtained by personal contact by email. The full list of NGO reports used in this study including report title, year of publication, organization name and how to access each report can be found at:
Harvard Dataverse: Details on reports used in the Maxdata project. https://doi.org/10.7910/DVN/32FUQV (Berti, 2021).
Data are available under the terms of the Creative Commons Zero “No rights reserved” data waiver (CC0 1.0 Public domain dedication).
We thank all non-governmental organizations that shared their baseline reports with us, and USAID and UNICEF for making the Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Surveys (MICS) data available. We thank Bana Salameh for assisting with extraction of data from the NGO reports.
Is the work clearly and accurately presented and does it cite the current literature?
Partly
Is the study design appropriate and is the work technically sound?
Partly
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Modeling, program evaluation, estimation procedures
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
I cannot comment. A qualified statistician is required.
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: I have been involved in several baseline and end-line surveys for projects in public health conducted in developing countries and have witnessed the pros and cons of conducting them. My engagement has been in the planning, design, data analysis and reporting phases. I have also used findings from the DHS/MICS to guide project design, implementation and evaluation.
Version 1 (04 Feb 21): reports from invited reviewers 1 and 2.