HIV prevalence correlated with circumcision prevalence and high-risk sexual behavior in India's states: an ecological study

Background: HIV prevalence varies between 0% and 1.6% in India's states. The factors underpinning this variation are poorly defined. Methods: We evaluated the relationship between HIV prevalence by state and a range of risk factors in the Indian 2015 National Family Health Survey. Pearson’s correlation was used to assess the relationship between HIV prevalence and each variable. The prevalence of each risk factor was compared between five high-HIV-prevalence states (>1% prevalence) and a large low-HIV-prevalence state (Uttar Pradesh; HIV prevalence, 0.06%). Results: There was an association between HIV prevalence and men's mean lifetime number of partners (r = 0.55; P = 0.001) and men reporting sex with a non-married, non-cohabiting partner (r = 0.40; P = 0.014). In general, men in high-prevalence states were less likely to be circumcised and (with the exception of Chandigarh) use condoms at last sex. In two high prevalence states (Mizoram and Nagaland), men reported a higher number of lifetime partners and a higher prevalence of multiple partners and high-risk sex in the past year. Conclusions: Variation in circumcision prevalence and sexual behavior may contribute to the large variations in HIV prevalence by state in India.


Introduction
There is little consensus as to whether or not differences in sexual behavior play an important role in determining the large differences in HIV prevalence between populations. Certain authors have claimed that non-behavioral factors such as differences in prevalence of herpes simplex virus 2 infection, circumcision rates and STI treatment efficacy are responsible for differences in HIV prevalence 1,2. Other authors have found that HIV prevalence is associated with high-risk sexual behavior 3-5. These latter studies have included ecological-level analyses. The rationale behind ecological studies is that the prevalence of sexually transmitted infections (STIs) is, to an important extent, determined by the structure of the local sex network 6. This is a population-level characteristic and thus ecological studies are required to assess if correlates of this network structure are associated with HIV prevalence 7. A number of studies 3,5,8,9, but not all studies 10, have found that correlates of network connectivity, such as rate of partner change and the prevalence of sexual partner concurrency, are positively associated with HIV prevalence. These associations have been found to be strongest when comparing ethnic groups and regions within countries 3,5,8,9,11,12. Nationally representative, HIV-serolinked Demographic Health Surveys (DHS) have been a particularly useful resource for these studies 3,8,11.
India is an interesting country in which to test the network connectivity thesis. HIV was first detected in India in 1986 and since then a range of sources have documented higher HIV prevalence in certain North East States and to a lesser extent Karnataka, Andhra Pradesh and surrounding areas 13-17. Individual-level analyses have found that well-established risk factors, such as partner number, contact with sex workers, intravenous drug usage and lack of circumcision, were associated with HIV infection 13,17-20. These analyses have, however, not explained the differential HIV prevalence by state. A number of authors have argued that the higher HIV prevalence in Northeast India may be due to patterns of intravenous drug use (IVDU) in this region. Differences in contact with sex workers, patterns of migration, gender inequality and non-traditional forms of sex work, such as devadasi in Karnataka (where women are 'married' to temple deities and have sex with temple attendees), are some of the other factors that have been advanced as reasons for differences in HIV spread in India, often with little supporting evidence 13,16-20. In this analysis we address this issue by assessing whether there is an ecological, state-level association between various HIV risk factors and HIV prevalence.

Methods
India is a federal union comprising 29 states and 7 union territories (Figure 1). We refer to these 36 entities as states. The states are typically large and vary in surface area from Rajasthan's 342,239 km² to Goa's 3,702 km² (median 88,752 km²). The Union Territories are far smaller (median size 491 km²) and for this reason are indicated with circles on the map in Figure 1. India's vast size incorporates a wide range of ethnic and linguistic groups. It has an estimated 2000 ethnic groups and classifies 23 languages as official 15. There are considerable differences between states in ethnic composition as well as in educational attainment, life expectancy and poverty rates 14,15.

Data source
We used the 2015 National Family Health Survey (NFHS-4) for this study. The NFHS-4 received ethical committee clearance for data analyses such as the one performed here. As a result, no specific ethics committee approval was necessary for this study.
The NFHS-4 used a household-based, two-stage stratified sampling approach. The first stage selected primary sampling units (PSUs) from the 2011 National Census. PSUs were villages in rural areas and census enumeration blocks in urban areas. In the second stage of the survey, 22 households were randomly selected with systematic sampling from every selected rural and urban cluster.
A total of 98% of selected households were successfully interviewed. In the interviewed households, women aged 15-49 and men aged 15-54 were eligible to participate. The individual response rate was 97% for women and 92% for men. The response rates were high in all states except for Delhi and Chandigarh (Table 1). The behavioral questionnaire was administered in 17 local languages using computer-assisted personal interviewing. Further details of the survey are provided in Table 1 and the NFHS-4 report 14.

HIV Testing and Prevalence estimates
In a random subsample of households, a dried blood spot from a finger-prick blood specimen was obtained from eligible women aged 15-49 and men aged 15-54 who consented to laboratory HIV testing. Respondents did not have access to the test results but were all referred to counseling and testing services in the local area. Dried blood samples were collected and subsequently tested for HIV using a Microlisa ELISA. Positive tests and a random 2% sample of negative samples were then tested with SD Bioline 1/2 ELISA (Standard Diagnostics Inc., Kyonggi-do, South Korea). Discordance between the two tests was resolved with Western Blot (BioRad). With the exception of Delhi, HIV testing rates were high in all states (median 95.9%, IQR 94.4-97.9%). Coverage was higher in women than in men in both rural areas (women, 95%; men, 90%) and urban areas (women, 91%; men, 84%).
The NFHS-4 was designed to provide HIV prevalence estimates that are representative of women (15-49 years) and men (15-54 years) for the whole country and for rural and urban areas. It also oversampled certain states so as to be able to generate representative samples for 11 groups of states (Table 2). These were states that had been found to have higher HIV prevalence rates in previous surveys, as well as the Uttar Pradesh-centered group of states, which was chosen as a low-HIV-prevalence comparator. These 11 groups of states consisted of composites of vast populations with considerable heterogeneity in sexual behaviors and circumcision prevalence (Table 1). Our primary research question involved assessing if HIV-related risk factors differed between high- and low-HIV-prevalence states. To avoid problems related to averaging out large differences in prevalence of risk factors in the groups of states, in our primary analysis we compared risk factors with estimated HIV prevalence by state. In a sensitivity analysis this analysis was repeated using the 11 groups of states.

Amendments from Version 1
The queries raised by the peer reviewers have been addressed as follows. Table 2 and Table 3 have been merged as suggested. The legend for Figure 1 has been amended as suggested. More background information has been provided on the topics requested. Other small edits have been made to make each table and figure interpretable on its own.

c Highest education level attained: Percent whose highest educational attainment is primary or no schooling.
d Percent in the Poorest Wealth Quintile: this variable refers to the percent of the respondents from this region that were calculated to fall in the poorest 20% (quintile) of the nationally derived wealth band. The wealth quintiles were derived from an asset index.
e Lifetime number of partners refers to the mean number of lifetime partners reported by those who reported being sexually experienced, excluding those with more than 90 partners.

High HIV prevalence states. UNAIDS defines populations with a 15-49-year-old HIV prevalence of greater than 1% as having a generalized HIV epidemic 22. We therefore defined states with a 15-49-year-old HIV prevalence of above 1% as high-HIV-prevalence states.
Low HIV prevalence comparator state. Uttar Pradesh was chosen as the low-HIV-prevalence comparator state for a number of reasons. The previous HIV-serolinked NFHS, as well as the current NFHS survey, found an adult HIV prevalence of less than 0.1% in Uttar Pradesh 15. It also has the largest population of any state in India and as a result it had the largest sample size of all states in NFHS-4. In sensitivity analyses we repeated the analyses using all Indian states with an HIV prevalence below 0.2% as the low HIV prevalence comparator population.

Independent variables
Each of the following predictor variables was calculated separately for men and women and was limited to those between the ages of 15-49 years for women and 15-54 years for men.
• Condom use at last sex: The percentage of respondents who reported using a condom at last sex, amongst those who have had sex in the past 12 months.
• Male circumcision: The percentage of men who reported being circumcised.
• High-risk sex (sex with a non-married, non-cohabiting partner): The percentage of respondents who reported sex with a non-marital, non-cohabiting partner in the past 12 months, amongst all respondents who reported sex in the past 12 months.
• Multiple partners in the past year: The percentage with two or more sexual partners in the past 12 months amongst all respondents who reported sex in the past 12 months.
• Lifetime sex partners: The mean number of reported lifetime sex partners amongst all respondents who reported having had sex, excluding those with greater than 90 partners.
The reason for excluding those with 90 or more partners was that in a small number of states these individuals exerted a large effect on the mean number of lifetime partners. Our primary research question was whether or not there were population differences in lifetime number of partners for the majority of the population. The percent of the population with 90 partners or more was small in all states: men, median 0.17 (IQR 0-0.29); women, median 0.39 (IQR 0.2-0.89) (Table 1). As a result we found it more informative to calculate mean lifetime number of partners excluding those with more than 90 partners.
Two socioeconomic control variables were assessed:
• Education attained: Percent of state respondents whose highest educational attainment is primary or no schooling.
• Poverty: Percentage of the respondents from this state that were calculated to fall in the poorest 20% (quintile) of the nationally derived wealth band. The wealth quintiles were derived from an asset index.
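As a rough illustration of how indicators like those defined above might be derived from individual responses, the sketch below computes a trimmed mean of lifetime partners and a simple percentage. It is unweighted, whereas the published analysis adjusted for the complex survey design. The function names and sample responses are hypothetical, and the cutoff is assumed to keep respondents reporting exactly 90 partners and drop those reporting more.

```python
from statistics import mean

MAX_PARTNERS = 90  # assumed cutoff: respondents reporting more than this are excluded


def mean_lifetime_partners(partner_counts, cap=MAX_PARTNERS):
    """Mean lifetime partners among sexually experienced respondents (count > 0),
    excluding anyone who reports more than `cap` partners."""
    kept = [n for n in partner_counts if 0 < n <= cap]
    return mean(kept) if kept else float("nan")


def percent_reporting(flags):
    """Percentage of respondents for whom a yes/no indicator is true."""
    flags = list(flags)
    if not flags:
        return float("nan")
    return 100.0 * sum(1 for f in flags if f) / len(flags)


# Hypothetical responses (not NFHS-4 data): one extreme value dominates the raw mean.
partners = [1, 1, 2, 1, 3, 150]
raw_mean = mean(partners)                        # includes the respondent reporting 150
trimmed_mean = mean_lifetime_partners(partners)  # excludes that respondent
condom_pct = percent_reporting([True, False, False, True])
```

With these made-up values the single extreme respondent pulls the raw mean far above the trimmed mean, which is the effect the exclusion rule is meant to limit.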

Statistical analysis
All analyses are ecological in nature and conducted with HIV prevalence by state or group of states as the outcome variable. The analyses were conducted using STATA 13.0 (College Station, TX) and were all adjusted to account for the complex sampling strategies of the survey using the survey (SVY) command. The analyses were stratified by gender. The two-sample Wilcoxon rank-sum test was used to assess if there was a difference in the median number of lifetime partners between the high-HIV-prevalence states and Uttar Pradesh/low-HIV-prevalence states. Cohen's d was used to assess effect size. Histograms were used to depict the distribution of the number of lifetime sexual partners by gender and state. Chi-squared tests were used to assess differences in categorical variables. Pearson's correlation was used to assess the relationship between state-level HIV prevalence and the prevalence of each risk factor.
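Two of the statistics named above can be written out in a few lines. The sketch below uses the textbook formulas for Pearson's r and for Cohen's d with a pooled standard deviation. It is not the Stata procedure used in the study, it ignores survey weights, and every value in it is hypothetical.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cohens_d(group1, group2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * stdev(group1) ** 2 +
                  (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / math.sqrt(pooled_var)


# Hypothetical state-level values (not NFHS-4 data).
risk_factor_by_state = [1, 2, 3, 4, 5]
hiv_prev_by_state = [2, 4, 5, 4, 5]
r = pearson_r(risk_factor_by_state, hiv_prev_by_state)

# Hypothetical lifetime partner counts in two populations.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

In an ecological analysis each element of the two lists is one state, so the correlation describes populations rather than individuals.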

Demographic variables and HIV prevalence
An overview of the sample size, mean ages and other demographic characteristics of men and women by state are provided in

Comparison of risk factors between high and low HIV prevalence states
Circumcision. The prevalence of circumcision was low in all states, including Uttar Pradesh (19.1%), but was significantly lower in all the high-HIV-prevalence states (2.5-10.6%; all P<0.0005).
High-risk sex. The prevalence of reported high-risk sex in men was 8.3% in Uttar Pradesh. A higher proportion of men in Mizoram (20%) and Nagaland (15%) but a lower proportion in Andhra Pradesh (2.8%) reported high-risk sex (all P<0.0005). Reported proportions in women were low and did not differ between states.

Multiple partners in past year.
Men in Mizoram were more likely to report multiple partners in the past 12 months (3.8%) than men in Uttar Pradesh (1.5%) (P<0.005).  Figure 2).

Lifetime partners. Men/women in
Two sensitivity analyses were performed. Firstly, repeating the analyses using all Indian states with HIV prevalences below 0.2% as the low-HIV-prevalence comparator population (instead of Uttar Pradesh) had little effect on the results (Table 2). Secondly, repeating the analyses using the 11 groups of states in place of the 36 states produced comparable results (Table 3). Low circumcision and condom-usage rates together with lifetime number of partners and high-risk sex (men only) remained significant risk factors in group 7 states (Mizoram, Manipur, and Nagaland) compared to the Uttar Pradesh group (Uttar Pradesh, Madhya Pradesh, Uttarakhand, and Rajasthan).

State-level correlates of HIV prevalence
Controlling for education and poverty levels, the prevalence of higher-risk sex in men (r=0.40; P=0.017) and mean number of lifetime partners in men (r=0.55; P=0.001) were positively associated with HIV prevalence (Table 4).
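The text does not state how the control for education and poverty was implemented. One common option is a partial correlation, sketched below with the standard recursive formula. This is an assumption rather than the authors' documented method, the helper names are hypothetical, and the eight "states" are invented values chosen so that the result is easy to check.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, controls):
    """Correlation of x and y after removing the linear effect of each control.

    Recursive formula: r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)),
    where every term is itself conditioned on the remaining controls.
    """
    if not controls:
        return pearson_r(x, y)
    z, rest = controls[-1], controls[:-1]
    r_xy = partial_r(x, y, rest)
    r_xz = partial_r(x, z, rest)
    r_yz = partial_r(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Invented values for eight hypothetical states (not NFHS-4 data).
education = [60, 40, 60, 40, 60, 40, 60, 40]        # % primary or no schooling
poverty = [25, 25, 15, 15, 25, 25, 15, 15]          # % in poorest wealth quintile
risk_sex = [8, 4, 4, 4, 8, 4, 4, 4]                 # % reporting high-risk sex
hiv_prev = [0.9, 0.5, 0.5, 0.5, 0.7, 0.3, 0.3, 0.3]  # HIV prevalence, %

raw_r = pearson_r(risk_sex, hiv_prev)
adjusted_r = partial_r(risk_sex, hiv_prev, [education, poverty])
```

In this constructed example part of the raw association is shared with the two controls, so the adjusted coefficient is smaller than the raw one. With real data the adjustment can move the coefficient in either direction.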

Discussion
As in other countries, the spread of HIV has been heterogeneous in India. Whilst the absolute differences in HIV prevalence by state are not large, the relative differences are. A state-level HIV prevalence above 1% is four times the national average prevalence. These differences in HIV prevalence are also fairly stable based on both data from the 2005 NFHS survey 15 and other data sources such as antenatal surveys from the 2000s until the present 13,16. Previous studies have argued that the higher prevalence rates in certain states in Northeast India are predominantly due to patterns of IVDU 13,16. We were unable to investigate the role of IVDU, but we found population-level differences in the prevalence of circumcision, condom usage and sexual behaviors, which could explain differential HIV spread.
The most consistent association we found was lower circumcision prevalence rates in the high HIV prevalence states. Circumcision rates were, however, as low or lower in a number of low HIV prevalence states as in the high HIV prevalence states. Although differences in condom usage were statistically significant, the absolute differences were small and one high-HIV-prevalence state had higher condom utilization than the low-HIV-prevalence comparators.
The relationship between the sexual behavioral risk factors and HIV prevalence was not uniform. Whilst sexual behaviors were in general riskier in the North East states, this was not the case in Andhra Pradesh and Chandigarh, where a number of the behavioral risk factors were actually less prevalent than in Uttar Pradesh. In a country as vast and diverse as India, it should not be too surprising to find different combinations of risk factors to be responsible for HIV spread in different regions. As noted above, previous studies have suggested that various factors in Andhra Pradesh and Karnataka may play a role in the local HIV epidemics there 13,20. An important finding of our study was the higher number of lifetime partners, multiple partners in the prior year and high-risk sex in Mizoram and Nagaland compared to Uttar Pradesh. Our study is thus compatible with the thesis that these behaviors would translate into a more connected sexual network which could play a role in the generation of higher HIV prevalence rates in these states.
These same three risk factors (number of lifetime partners, multiple partners in the prior year and high-risk sex) have previously been found to be associated with differences in HIV prevalence by ethnic group in Kenya 3, South Africa 5 and elsewhere 8,23. Previous reports from DHS data found that, at an individual level, there was a stepwise increase in HIV prevalence with increasing lifetime sex partners in all 15 countries with available data. This was true for both men and women in all cases 4,24. This association was also present in the Indian NFHS-3 in 2005 and in this survey 14,15. That this association was present at both individual and population levels increases the probability that the association between lifetime sex partners and HIV prevalence is real 25,26. In our study, this association was, however, only statistically significant for men. High-risk sex has also been found to be an individual-level risk factor for HIV infection in a number of countries, including India 4,14. As in other areas, it is likely that these patterns of higher-risk sexual behavior interact with other risk factors such as IVDU to produce the observed differences in HIV prevalence 16,19.
This study has a number of limitations. The study is ecological in nature and is thus susceptible to the ecological inference fallacy. DHS surveys are not optimal for collecting sensitive sexual information 27,28. The use of computer-assisted interviewing would be expected to have reduced but not eliminated this problem 27. In addition, although the response rates for participation in the survey and HIV testing were high, they varied somewhat between states. The survey was designed to provide HIV prevalence estimates for 11 groups of states and not individual states. The data are thus susceptible to a large number of biases such as misclassification, nonresponse, recall and social desirability biases. In particular, other work has found evidence of culture-specific heterogeneity in answering questionnaires 29,30. We cannot exclude the possibility that respondents from states where lower-risk sexual behavior was reported were subject to a greater social desirability bias, which could invalidate our findings. The available evidence, however, suggests that only minor differences in sexual behavior exist between those who do and do not answer sexual behavior questionnaires 31. Finally, the analyses do not control for the influence of other variables.

Conclusion
We found a range of risk factors to be more prevalent in high-HIV-prevalence states in India. There was no clear single risk factor (or combination thereof) which appeared capable of explaining the heterogeneity of HIV spread in India. In the case of Mizoram and Nagaland, however, a higher prevalence of sexual risk behaviors may be contributing to higher HIV prevalence rates in this region. More detailed comparative studies between these populations and lower prevalence populations elsewhere in India may confirm or refute this finding.
Studies of other higher-HIV-prevalence populations that have managed to reduce HIV incidence, including Uganda, Zimbabwe, Kenya and Thailand, have pointed to the importance of reductions in multiple partnering in effecting this decline 32-35. If other studies confirm our findings, then similar campaigns could be considered in Mizoram and Nagaland.

Data availability
The NFHS-4 survey is freely available from www.measureDHS.com as part of the India: Standard DHS, 2015-16. Access to the dataset requires registration, and is granted to those that wish to use the data for legitimate research purposes. A guide for how to apply for dataset access is available at: https://dhsprogram.com/data/Access-Instructions.cfm.

Grant information
The author(s) declared that no grants were involved in supporting this work.

The data for the study were obtained from the NFHS-4 survey. In the methods section, where you discuss ethics, could you clarify that it is the survey that had ethical approval rather than this study?
I did have some difficulty following the geography outlined in the paper. In Figure 1, some of the circles for union territories appear not to be associated with any particular area (e.g. DN). There is double labelling for DD and PY, and LD is not in the legend. In the text there was mention of 11 groups of states which should have appeared in Table 2, but there are only five states mentioned in this table. Table 4 might be the more appropriate table. Table 2 and Table 3 are almost identical, with only the last row different; you could probably get rid of one table by combining them.
In the discussion, whilst I accept that there is overlap of the circumcision rates between various states with high and low HIV prevalence, I don't think there is enough data available in the study to support the contention that circumcision is not an important factor. I think the study supports high-risk sex and condom use as being the most important factors, but in states where circumcision rates are low and HIV prevalence high, there should be some consideration of the role of circumcision in the public health response to HIV.

Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.

Reviewer Expertise: I am an infectious diseases physician with interests in the diagnosis of infectious diseases and various aspects of HIV including the role of circumcision in the prevention of HIV.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 12 Jun 2019
Chris Kenyon, University of Cape Town, Cape Town, South Africa

The wording as to the ethical approval process has been amended as suggested.
The following text has been added to the methods section to assist the reader with understanding the difference in size between Union Territories and States: The states are typically large and vary in surface area from Rajasthan's 342,239 km² to Goa's 3,702 km².

Is the work clearly and accurately presented and does it cite the current literature?

Is the study design appropriate and is the work technically sound? Partly
Are sufficient details of methods and analysis provided to allow replication by others? Yes

If applicable, is the statistical analysis and its interpretation appropriate? No
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? No
Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Immunology and Infectious Diseases
We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however we have significant reservations, as outlined above.
Author Response 12 Jun 2019
Chris Kenyon, University of Cape Town, Cape Town, South Africa

Thank you for this comment. We have provided more background information on the heterogeneity of peoples in India's states (Page 3): India is a federal union comprising 29 states and 7 union territories (Figure 1). We refer to these 36 entities as states. The states are typically large and vary in surface area from Rajasthan's 342,239 km² to Goa's 3,702 km² (median 88,752 km²). The Union Territories are far smaller (median size 491 km²) and for this reason are indicated with circles on the map in Figure 1. India's vast size incorporates a wide range of ethnic and linguistic groups. It has an estimated 2000 ethnic groups and classifies 23 languages as official. There are considerable differences between states in ethnic composition as well as in educational attainment, life expectancy and poverty rates.
This information has been added: India's prior HIV prevalence survey (NFHS-3) in 2005-2006 was designed to provide HIV prevalence estimates from six states. Of these, Manipur had the highest HIV prevalence (1.13%), followed by Andhra Pradesh (0.97%), Karnataka (0.69%), Maharashtra (0.62%), Tamil Nadu (0.34%) and Uttar Pradesh (0.07%) in 15-49 year old men and women.
This has been made clear in the new version which now reads (Page 8): Controlling for education and poverty levels, the prevalence of higher-risk sex in men (r=0.40; P=0.017) and mean number of lifetime partners in men (r=0.55; P=0.001) were positively associated with HIV prevalence (Table 4).
The word 'likelihood' has been changed to 'probability.' Thank you for pointing this out. The five states that are classified as high HIV prevalence are now clearly listed in the title of Table 3. The list of all low HIV prevalence states would be too long to merit inclusion in the title, but the definition of low HIV prevalence state (HIV prevalence ≤0.2%) has been included in the title. Table 3. Comparison of prevalence of HIV-related risk factors in high HIV prevalence states (>1%; Andhra Pradesh, Chandigarh, Manipur, Mizoram and Nagaland) with low HIV prevalence states (≤0.2%). The Table contains prevalence figures of HIV-related risk factors and no coefficients. This has been made clear in the title and in the column headings.

Competing Interests: No competing interests