Could malaria explain the global distribution of the angiotensin converting enzyme I/D polymorphism? A systematic review and ecological study [version 1; peer review: 1 approved with reservations, 1 not approved]

Background: The D-allele of the angiotensin converting enzyme (ACE1) has been linked to an increased risk of certain diseases including hypertension and COVID-19 but a decreased risk of cerebral malaria. We hypothesized that malaria played a role in determining variations in the global distribution of ACE1 I/D polymorphism. Methods: A systematic review was conducted to summarize the frequency of ID/DD genotypes in all countries with available data. Results: The ID/DD genotype frequency was found to be highest in Africa (median 86.4%, IQR 83.6-94.7%) and Eastern Mediterranean (median 84.5%, IQR 78.3-89.8%) and lowest in South East Asia (median 55%, IQR 49.5-67.8%) and Western Pacific (median 61.1%, IQR 55.0-67.2%). Linear regression revealed positive associations between ID/DD genotype frequency and the incidence of malaria, malaria mortality as well as hemoglobin S allele frequency (all P<0.05). Conclusions: Our findings are compatible with the hypothesis that malaria played a role in establishing the differential frequency of the D-allele.


Summary
The D-allele of the angiotensin converting enzyme (ACE1) has been linked to a number of diseases including hypertension, obesity and most recently COVID-19. Previous research has suggested that the prevalence of this allele may vary between world regions. Because the D-allele has been shown to be protective against cerebral malaria, we hypothesized that malaria played a role in determining variations in the global distribution of ACE1 I/D polymorphism. We conducted a systematic review of published literature to estimate the prevalence of the D-allele by country and World Health Organization region. The D-allele was found to be most prevalent in Africa and least prevalent in the Western Pacific. A positive association was found between the prevalence of the D-allele and three markers of malaria burden of disease. Our findings are thus compatible with the hypothesis that malaria played a role in establishing the differential frequency of the D-allele.

Introduction
The renin-angiotensin system plays an important role in the regulation of cardiovascular, renal and immune physiology [1][2][3] . The angiotensin converting enzyme-1 (ACE1) converts angiotensin I to angiotensin II, which is a key effector molecule in this system 2 . The ACE1 gene insertion (I)/deletion (D) polymorphism of a 287 bp Alu repeat explains a large proportion of the individual variation in ACE1 levels in both serum and tissue 2 . ACE1 DD homozygotes and ID heterozygotes have approximately 65% and 30% higher serum ACE1 levels, respectively, than II homozygotes 3,4 . This increased ACE1 expression is thought to partly explain the association between the D-allele and an increased risk for diseases such as hypertension, heart failure, cerebrovascular disease, diabetic nephropathy, arthritis, various cancers, asthma, acute respiratory distress syndrome (ARDS) and possibly COVID-19 incidence and mortality 3,5,7,8 . Increased ACE1 in D-allele carriers generates higher angiotensin II levels, whose stimulation of the angiotensin II type 1 receptor then activates vasoconstriction as well as a number of pro-inflammatory and fibrotic pathways which increase the risk of these conditions 3,7-10 . It is unknown why the frequency of this polymorphism varies between different populations around the world 5,6 . In case-control studies, the D-allele has been associated with protection against progression to severe malaria 11-13 . In vitro and in vivo studies have demonstrated that angiotensin II can reduce the entry of plasmodia into erythrocytes and protect the blood-brain barrier against invasion by plasmodia 12-15 . These findings have led a number of authors to propose that differential intensity in malaria exposure may have driven variations in the prevalence of the D-allele 13,14 , in a similar way to how malaria influenced the prevalence of haemoglobin S (HbS) and other polymorphisms associated with different susceptibility to malaria 16 .
In this study, we aimed to assess the association between D allele frequency and malaria using three indicators of malaria exposure: malaria incidence, malaria mortality and the prevalence of the HbS allele 12 .
To do this we first conducted a systematic review of the frequency of the ID/DD genotypes in different countries around the world. A prior systematic review measured D allele frequencies by ethnicity, sex and age and reported D allele frequencies of 56.2% in 'Whites', 60.3% in 'Blacks' and the lowest (39.1%) in 'Asians' 6 . Results were similar when the DD genotype was measured. That review did not evaluate the geographical variation of the I/D polymorphism.

Methods
Systematic review of ACE1 DD and ID/DD prevalence
In April 2020, we used PubMed and Google Scholar to conduct a systematic review with the objective of obtaining country-level estimates of ACE1 DD and ID+DD frequency. We used the following MeSH terms: 'ACE' OR 'angiotensin-converting enzyme' AND ('polymorphism' OR 'deletion' OR 'insertion'). The following inclusion criteria were used: a sample size of at least 50; publication in English; and a study population that could be considered representative of the general population (the studies were typically the control groups of case-control studies evaluating the associations between I/D polymorphisms and specific diseases). Studies from previous systematic reviews of associations between D-allele frequency and various diseases could be included if they met the entry criteria. No date restrictions were used. Duplicate studies were removed (Figure 1). The data was extracted by a single author (CK) and manually checked for accuracy. Risk of bias was not assessed. The principal summary measure used was the prevalence of the genotype of interest expressed as a percentage. The PRISMA guidelines were followed for the manuscript (see Extended data, STable 1 17 ).
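The summary measure described above can be illustrated with a minimal sketch. It assumes each included study reports counts of the II, ID and DD genotypes in its control group; the `GenotypeCounts` container, the function names and the example counts are hypothetical and are not taken from the study's extraction sheet.

```python
from dataclasses import dataclass

MIN_SAMPLE_SIZE = 50  # inclusion criterion stated in the Methods


@dataclass
class GenotypeCounts:
    """Genotype counts from one control group (hypothetical container)."""
    n_ii: int
    n_id: int
    n_dd: int

    @property
    def total(self) -> int:
        return self.n_ii + self.n_id + self.n_dd


def meets_size_criterion(counts: GenotypeCounts) -> bool:
    """True if the study satisfies the minimum sample size of 50."""
    return counts.total >= MIN_SAMPLE_SIZE


def genotype_prevalence(counts: GenotypeCounts) -> tuple[float, float]:
    """Return (DD prevalence, ID+DD prevalence), both as percentages."""
    n = counts.total
    dd_pct = 100.0 * counts.n_dd / n
    id_dd_pct = 100.0 * (counts.n_id + counts.n_dd) / n
    return dd_pct, id_dd_pct


def d_allele_frequency(counts: GenotypeCounts) -> float:
    """D-allele frequency (%): each DD carries two D alleles, each ID one."""
    return 100.0 * (2 * counts.n_dd + counts.n_id) / (2 * counts.total)


# Made-up counts, for illustration only.
example = GenotypeCounts(n_ii=20, n_id=50, n_dd=30)
dd_pct, id_dd_pct = genotype_prevalence(example)
d_freq = d_allele_frequency(example)
```

Note that the ID/DD genotype prevalence (the share of people carrying at least one D allele) is not the same quantity as the D-allele frequency, so the two should not be compared directly across reviews.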
HbS: The HbS allele frequency (expressed as a percentage) was obtained from a study that estimated national HbS allele frequencies for 190 countries 18 .

Malaria incidence:
The national incidence of malaria per 1,000 population at risk for the year 2000 (the earliest year with available data), was obtained from the World Health Organization (WHO), Global Health Observatory Data Repository. A value of zero cases per 1,000 was used for countries classified as having eliminated malaria in 2000.

Malaria mortality.
Malaria mortality was measured as the age-standardized national number of deaths from malaria per 100,000 individuals. The data was obtained from the Institute for Health Metrics and Evaluation Global Burden of Disease study for the year 1990, the first year with available data.

Geographical grouping.
For the global analyses, countries were classified into 6 regions according to the WHO system.
A previous study limited to European countries found an ecological association between COVID-19 incidence and D-allele frequency, with both having higher values in Southern European countries 5 . This provided the motivation to compare the ID/DD prevalence between European regions. We used the four European regions as defined by the United Nations Geoscheme for Europe for this purpose.

Statistical analysis.
Linear regression was used to evaluate the country-level association between the prevalence of ID/DD genotypes and the three measures of intensity of malaria exposure. Differences in ID/DD genotype prevalence between world regions were assessed via the Wilcoxon rank-sum test. The analyses were performed in Stata version 16 (StataCorp, College Station, TX). The maps were produced with Data Wrapper v2.
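The two analyses named in the statistical analysis can be sketched in plain Python. This is an illustration under stated assumptions, not the Stata code used for the study: the regression is a simple least-squares fit with a malaria measure as predictor and ID/DD prevalence as outcome, the rank-sum test uses the normal approximation without tie or continuity correction, and all function names and data values are made up.

```python
import math


def simple_linear_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Assumption: no tie correction and no continuity correction.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p_value


# Made-up country-level values, for illustration only.
malaria_incidence = [0, 100, 200, 300]   # cases per 1,000 at risk
id_dd_prevalence = [60, 70, 80, 90]      # percent
slope, intercept = simple_linear_regression(malaria_incidence, id_dd_prevalence)

region_a = [86, 90, 95]  # hypothetical ID/DD prevalence values, region A
region_b = [50, 55, 60]  # hypothetical ID/DD prevalence values, region B
w, z, p_value = rank_sum_test(region_a, region_b)
```

The sketch returns only the fitted coefficients; a P value for the slope would additionally require the residual variance and a t distribution, which statistical packages such as Stata report directly.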

Results
Systematic review of ACE1 DD and ID/DD prevalence
Searches of the literature identified 5099 references. The application of the selection criteria restricted the total to 248 informative references published between 1993 and 2020, providing 253 prevalence estimates from 71 countries (Figure 1; Extended data, STable 2 17 ). Of these countries, 7 were located in the Americas, 15 in Eastern Mediterranean, 26 in Europe, 8 in sub-Saharan Africa, 5 in South East Asia and 10 in Western Pacific (Table 2).

Discussion
Differential exposure to malaria has been found to play a role in driving the large geographical variations in the prevalence of various haemoglobinopathies, thalassemias, glucose 6-phosphate dehydrogenase deficiencies, erythrocyte membranopathies and various mediators of inflammation and immunity 19,20 . We found country- and regional-level differences in the prevalence of the ACE1 D-allele that were associated with malaria incidence, malaria mortality and HbS allele frequency. These findings add to previous individual-level studies showing protection against severe malaria by the D allele, suggesting that malaria may have played a role in the global variations in D-allele frequency.
An important weakness of the study is our use of malaria incidence and mortality data from 1990 and 2000. Although this was the earliest data we had available, it underestimates the historical burden of malaria for much of the world's population up to the beginning/mid-20th century, as shown in Figure 4 20 . This figure of malaria prevalence from the preintervention period (around 1900) was produced by Lysenko in 1968. Unfortunately, it has not been translated into estimated numerical values of national prevalence that could be used to test numerical associations 21 . Although we did not test this statistically, it does appear that a number of the countries with high D-allele prevalence but low malaria incidence in 2000 in our analysis had a high malaria prevalence in 1900 (Figure 3 and Figure 4) 20 . For example, all 10 countries with ID/DD prevalence above 81% and malaria incidence of 0 in 2000 were classified as being at least hypoendemic for malaria transmission in 1900 according to Lysenko 21 . Our finding of a higher prevalence of the D-allele in Southern than Northern Europe is also consistent with the higher historical prevalence of malaria in this part of Europe 20,21 .
Another weakness of our study is the small number of countries with data for D-allele frequency and malaria burden of disease. We also acknowledge that some of the differences in ID/DD frequency between studies and countries may be attributable to differences in how control groups were selected, differences in age distribution and differences in the methodology used to genotype individuals 6 . It is also important to note that recent genome-wide association studies have not found the ACE1 I/D polymorphism to be associated with severe malaria 22,23 . The lack of an association in these studies may, however, be related to the high prevalence of the D-allele in malaria endemic areas, as well as the high level of genetic diversity and weak linkage disequilibrium in African populations compared to elsewhere 22 . The association between malaria and the D-allele could be explained by confounding. It is possible that the actual driver of the D-allele is an environmental factor such as another infection that is associated with malaria. Finally, not all studies we drew our data from reported whether the I- and D-alleles were in Hardy-Weinberg equilibrium. Including data from studies where these alleles were not in Hardy-Weinberg equilibrium would, however, be expected to introduce a non-differential misclassification bias into the analysis 24 . This would be expected to dilute the strength of association between the prevalence of the D-allele and malaria severity.
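For readers unfamiliar with the Hardy-Weinberg check mentioned above, the following is a minimal sketch of the standard chi-square goodness-of-fit test applied to one study's genotype counts. It was not part of the analysis reported here; the function name and the counts are hypothetical, and the P value uses the identity that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def hardy_weinberg_chi_square(n_ii, n_id, n_dd):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi-square statistic, P value with 1 degree of freedom).
    Assumes both alleles are present in the sample.
    """
    n = n_ii + n_id + n_dd
    freq_i = (2 * n_ii + n_id) / (2 * n)  # I-allele frequency
    freq_d = 1.0 - freq_i                 # D-allele frequency
    if freq_i == 0.0 or freq_d == 0.0:
        raise ValueError("test is undefined when only one allele is observed")

    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n * freq_i ** 2, 2 * n * freq_i * freq_d, n * freq_d ** 2)
    observed = (n_ii, n_id, n_dd)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts, for illustration only.
chi2_balanced, p_balanced = hardy_weinberg_chi_square(25, 50, 25)
chi2_excess_het, p_excess_het = hardy_weinberg_chi_square(10, 80, 10)
```

A small P value would flag a control group whose genotype proportions depart from Hardy-Weinberg expectations, which can indicate genotyping error or a non-representative sample.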
Given these concerns, we can only conclude that our data is compatible with the hypothesis that malaria played a role in establishing the frequency distribution of the D-allele. This association could provide a potential evolutionary explanation for the commonly observed higher prevalence of hypertension in individuals with African genetic background 25,26 . If the association between the D-allele and COVID-19 incidence/morbidity is confirmed, our mapping of D-allele prevalence could assist with estimating morbidity and mortality in different populations.

Data availability
Underlying data
All data underlying the results are available as part of the article and no additional source data are required.
This project contains the following extended data:
• STable 2. Study characteristics of studies included in systematic review, including frequency of II, ID and DD genotypes of ACE1, arranged according to country and year of study.
• STable 4. Linear regression assessing the country-level association between the prevalence of the DD genotypes of the ACE1 gene and malaria incidence per 1,000, malaria mortality per 100,000 and the allele frequency of hemoglobin S.

Authors contributions
CK conceptualized the study, was responsible for the acquisition, analysis and interpretation of data including the systematic review and wrote the analysis up as a manuscript. AR assisted with the study design and analysis. Both authors contributed to and approved the final draft.

Alfred Amambua-Ngwa
Medical Research Council Unit, London School of Hygiene and Tropical Medicine, Fajara, UK
The authors present a somewhat systematic review of publications to determine the relationship between ACE1 ID/DD alleles and malaria. The authors used malaria incidence rather than prevalence information as well as the distribution of HbS and malaria mortality. The review has a strong rationale given the suspected role of ACE1 in COVID-19 and associations with high blood pressure in populations of African descent. However, there are several limitations with the data gathering and analysis process that do not lend full credibility to the conclusions of malaria being a contributing factor to the prevalence of the ID/DD alleles:
○ Data curation by a single author is not in line with standards for systematic reviews.
○ The authors have not clearly stated the number of different publication types that were included (short communications, letters, posters, reviews, and conference abstracts). The method seemed limited by the number of databases or sources used for the search and the numbers in the flow chart need further clarification on how some publications were eliminated. For example, why were only 312 texts assessed for eligibility out of the 402 texts obtained after screening? Why were the final 248 texts included after excluding 90 out of the 312 assessed?
○ Both first (Title/abstract) and second (full text) screening were done by one author which might have biased the screening strategy. Moreover, the author CK seems to have done everything in the manuscript, and this questions the contribution(s) of the second author.
○ It is not clear how the authors assessed the criteria of "the population being a representative of the general population" in the selected 248 articles. This is important because allele frequencies depend on census sizes from the population. Yet the authors did not provide country-level information on the number of samples. This reflects a general flaw in the statistics as results are presented per sub-region. Country-level malaria incidence for example can be widely different within sub-regions and so country-level data would have been more informative. Given that the frequency of the D allele is almost fixed in most of Africa and equally high in malaria-free regions, the regression in my opinion cannot be robust. No explanation for example was given for the lower frequency in South East Asia where there is historically a higher prevalence of malaria than in Europe.
From figure 3 there is a higher prevalence of the D allele in higher malaria prevalence regions, but the relationship is far from linear. Hence the coefficients presented cannot be accurate.
expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

In this study, the authors performed a systematic review to analyze the association between ACE1 I/D polymorphism and malaria. The topic is novel, but there are some serious defects that cannot be ignored.

1. This study included previous systematic reviews regarding ACE1 I/D, which is inconsistent with the scientificity of a systematic review. Besides, the search strategy and databases could not ensure the comprehensiveness of the researches included in this study, and the included 248 literatures were neither listed in this paper nor in the supplement. The data was extracted by only a single author, this is inconsistent with the basic requirement of data extracting method in systematic reviews. Moreover, the author did not evaluate the publication bias and method quality of the included studies, which may influence the reliability of the conclusion.

2. For the systematic review of gene polymorphism, Hardy-Weinberg test was not performed in this study. Whether the included researches meets Hardy-Weinberg test is important to reflect the representativeness of the population.

3. There are defects in statistical analysis. Linear regression was used to evaluate the country-level association between the prevalence of ID/DD genotypes and the three measures of intensity of malaria. However, the scatter plots (Figure 3) could not reflect the feasibility of linear regression, and the authors did not explain whether the data meet the assumption of linear regression. Besides, the prevalence of the ACE1 ID/DD genotypes should be obtained only from the healthy population, but the authors did not display the data resource in the figures or tables. Additionally, in Table 1 and Table 2, there is a methodological error to analyze the association between gene polymorphisms and malaria in other regions based on the region with the largest sample size, which may obtain fundamentally wrong results and conclusions.

4. The results of ID/DD genotype were analyzed in the study. Although it was mentioned in the last sentence of the results section that "Results were similar when these regressions were run with DD frequency as the outcome variable", it could not be concluded that the D allele is associated with malaria. It is suggested that on the basis of overcoming the above