Keywords
wealth portfolio index, urban flood-prone regions, socio-economic position, nonlinear principal components analysis, categorical principal components analysis
This study aims to develop a short version of the wealth portfolio index by incorporating expenditure data, addressing the need for practical socioeconomic assessment tools in flood-prone regions of Indonesia. We followed the structure of the national socioeconomic survey, which includes comprehensive data on food and non-food expenditures, household assets, and housing characteristics.
To construct and validate the short-form measure, a random sample of residents (N=700) was collected from flood-prone areas in Bima, Manado, and Pontianak. The study employed nonlinear principal components analysis in the form of categorical principal component analysis (CATPCA), which is suitable for integrating variables measured on differing scales.
The CATPCA results support a two-dimensional wealth portfolio index consisting of 24 items in total, reflecting both asset ownership and expenditure patterns relevant to the target population.
The emergence of two distinct dimensions highlights the multidimensional nature of socioeconomic status in environments with recurrent flooding. Based on these findings, we suggest using the short-form index, which provides robust item coverage while facilitating efficient household-level assessments in diverse Indonesian communities, especially in flood-prone areas.
The accurate assessment of household socio-economic status (SES) is imperative for impactful social sciences research, effective policy formulation, and the targeted delivery of interventions. This is particularly relevant for developing countries, where resource allocation is critical and vulnerability is pronounced. For populations residing in challenging environments, such as urban flood-prone regions, nuanced SES measures are especially crucial for understanding heterogeneity in living conditions, identifying drivers of deprivation, and designing appropriate support mechanisms.
The wealth index has long been used to assess the socio-economic position of a household based on household asset data and housing characteristics.1,2 Its popularity stems from its shorter scale compared to expenditure data and its accuracy in explaining variation in education, health and the overall financial condition of households.3 A shorter scale means easier computation and, most importantly, less fatigue among participants, resulting in better response rates.4 Consequently, the wealth index is often preferred to expenditure and income data in household surveys, especially in developing countries.
Expenditure data have become less favoured in household surveys because they are long and tedious to measure, and are thus considered prone to high non-response rates.1 However, while asset indices capture an important dimension of SES, their focus on accumulated stock can blur critical aspects of household economic life related to current consumption flows. Expenditure data, though more cumbersome to collect, provide invaluable insights into immediate living standards, food security, dietary diversity, and access to essential services, as in the National Socioeconomic Survey in Indonesia (abbreviated as SUSENAS in Indonesian). Food consumption patterns are a vital input for formulating precise health and nutrition policies, offering a complementary perspective to housing characteristics and asset ownership. SUSENAS itself is conducted twice a year for the consumption/expenditure domain,5,6 and once every three years, in rotation, for the health and housing module, the social security module, and the social, cultural, and education module.
This methodological trade-off—between the practicality of asset indices and the informational richness of expenditure data—highlights a significant gap in SES measurement. There is a compelling need for an approach that can holistically capture both the long-term dimension of accumulated household wealth and the more immediate realities of consumption and expenditure, without imposing burdens on survey participants or research logistics. Such an integrated measure, which we term a “wealth portfolio index,” aims to provide a more comprehensive and dynamic understanding of a household’s livelihood and socio-economic position. This is particularly important for flood-prone dwellers, whose economic lives often involve a complex interplay of limited assets and fluctuating consumption capabilities.
In developing a robust wealth portfolio index, the most common practice is to use principal components analysis (PCA). This statistical procedure aims to reduce a large collection of variables/items as far as possible and cluster them into a minimum number of dimensions.7 Based on its main use, PCA is often considered a dimension or data reduction method. It analyses all of the variance of the items (or manifest variables) and configures their relation to the factor(s) based on that variance.8 This makes PCA a suitable option when working with a large set of expenditure items, household asset data and housing characteristics (for wealth index creation using PCA, see1,2).
Since this study aims to combine different socio-economic measures involving various measurement scales, standard PCA is not an appropriate option. Looking at the different dimensions included in the proposed wealth portfolio index, we note that expenditure measures are often assessed in the nation’s currency (Rupiah), which is a ratio scale, as are household asset data, while housing characteristics are often measured on nominal and ordinal scales. To address this methodological challenge, we employ nonlinear principal components analysis (NLPCA), specifically Categorical Principal Components Analysis (CATPCA). This analysis is designed to accommodate variables with different measurement scales (i.e., nominal, ordinal and numeric) that might also be nonlinearly related to each other.9,10 CATPCA has the same goal as PCA, namely reducing a large pool of variables to a smaller number of principal components by mapping the relationship structures between observed variables and finding a minimum number of linear combinations that explain the largest variance in the data.
The results of CATPCA are analogous to those of PCA.9 First, it provides eigenvalues, which show the variance accounted for (VAF) by each principal component generated. As stated by Sartipi et al.,10 “An Eigenvalue is a ratio of the variance of all variables determined by the relevant dimension”. In other words, the value indicates how much of the total variance of all observed variables each dimension can explain. Second, the output shows component loadings, which indicate the correlations between the observed variables and the principal components. As a general rule, a higher loading indicates a higher correlation. Third, it provides communality values, that is, the sums of squared component loadings over components for each variable. This value indicates how much of each observed variable’s variance is accounted for by the retained components. To this end, we ran CATPCA in SPSS version 27.
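To make the three output quantities concrete, the following is a minimal sketch that computes eigenvalues, component loadings and communalities for ordinary linear PCA on a correlation matrix. It is only an analogue: CATPCA additionally applies optimal scaling to categorical variables before this step, and the sketch does not reproduce the SPSS implementation. The function name `pca_summary`, the synthetic data and the weight matrix are all hypothetical and are not taken from the RISE dataset.

```python
import numpy as np

def pca_summary(data, n_components):
    """Linear PCA on the correlation matrix (illustrative analogue only)."""
    corr = np.corrcoef(data, rowvar=False)        # items x items correlations
    eigvals, eigvecs = np.linalg.eigh(corr)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)         # item-component correlations
    vaf_share = eigvals / corr.shape[0]           # eigenvalue / number of items
    communalities = (loadings ** 2).sum(axis=1)   # squared loadings summed per item
    return eigvals, vaf_share, loadings, communalities

# Synthetic data (hypothetical): six items driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
weights = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
                    [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
data = latent @ weights.T + 0.5 * rng.normal(size=(500, 6))

eigvals, vaf_share, loadings, communalities = pca_summary(data, n_components=2)
print("eigenvalues:", np.round(eigvals, 2))
print("VAF share per component:", np.round(vaf_share, 3))
print("communalities:", np.round(communalities, 2))
```

In this linear setting, the squared loadings in each column sum to that component's eigenvalue, and the squared loadings in each row sum to the item's communality, which is the relationship the paragraph above describes.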
To serve the purpose of this study, we use a dataset from the RISE project, which focused on the relation between water management, wellbeing and resilience among the general population living in flood-prone areas in Indonesia, namely in Manado, Bima and Pontianak. The dataset provides two advantages to the study. First, Indonesia is considered a developing country that shares similar economic growth with other nations such as Vietnam, the Philippines, and Thailand.11 Therefore, the wealth index measure can be used or modified in those similar countries. Second, the dataset was gathered from random samples of the general population. This greatly reduces selection biases and increases the validity of the measures. Among many other measures, the survey involved the consumption/expenditure domain from SUSENAS, household asset data and housing characteristics. For the purpose of this study, we use these measures to create the wealth portfolio index.
By creating a short version of the wealth portfolio index that consists of household assets, housing characteristics and expenditure data, we believe that this study provides at least two academic contributions. First, it proposes an empirically derived and effective tool for assessing a household’s socio-economic position that moves beyond traditional asset-only indices by strategically incorporating expenditure information.1–3 This enriched measure has the potential to capture a more complete picture of household livelihood, which is particularly critical for understanding and addressing health and nutritional outcomes where food consumption plays a direct role. Filmer and Pritchett3 have proposed the use of the wealth index in cases of missing expenditure data, such as in the Demographic and Health Survey (DHS), to explain educational attainment. This works well because school opportunities and risks can be directly observed through the ownership of assets, e.g., a motorcycle, and housing characteristics, e.g., piped water. However, in terms of measuring health outcomes, food consumption also plays an important role in explaining health behaviours. Thus, having expenditure data allows researchers to target health policy more accurately. Based on this, the study contributes to the use of the wealth portfolio index, as part of the larger set of measures, in capturing a household’s livelihood.12
Second, this study demonstrates the utility of NLPCA in constructing such an index by systematically reducing a broad pool of items to a concise set of key indicators, thereby addressing the practical challenges typically associated with collecting detailed expenditure data.9 The proposed short version of the wealth portfolio index provides an efficient way to measure expenditure information while still maintaining accuracy. Similar to the idea of selecting only the most important items of household assets (e.g.,1), based on the analysis, we include only the expenditure items that are most important for the wealth portfolio index measure.
In a nutshell, we argue that it is possible to include expenditure data in the wealth index measure to assess the socio-economic position of a household. By employing NLPCA, we can include all items on household assets, housing characteristics and expenditure in one model and configure their factor and component loadings.
We used the RISE survey dataset, which focused on the relation between water management, wellbeing and resilience among flood-prone dwellers in Manado, Bima and Pontianak, Indonesia. The survey was conducted from November 2021 until February 2022. The survey aimed to collect random samples from the general population aged 18 years or above who had lived in the area for at least 3 years.13 The research permit for data collection was granted by the Directorate General of Politics and General Administration of the Ministry of Home Affairs of the Republic of Indonesia in 2021 (470.02/7428/Polpum) and the study’s ethical clearance was reviewed by the Research Ethical Committee of Universitas Indonesia (011/FPsi.Komite Etik/PDP.04.00/2022).
Specifically, the random sampling of participants was done using a random walk with a two-house interval. This was done due to the limited access to the population registry in each location. Prior to taking part in the survey, each participant was informed about the purpose and procedures of the study and provided written informed consent to participate voluntarily. The survey successfully gathered 700 participants: 200 in Manado, 200 in Bima and 300 in Pontianak. The dataset, along with documentation of the data collection process, is freely available on the Dutch Archiving Network System (DANS) platform.
Household assets
There are 7 items measuring household assets in the original scale. The items were adopted from SUSENAS and modified by Lembaga Demografi (LD) in Indonesia, which helped carry out the survey in 2021. These items cover a wide range of assets, including land and transportation assets. Examples of the items are “How many bicycles does this household have?”, “How many televisions does this household have?” and “How many motorcycles does this household have?”.
Housing characteristics
The housing characteristics measure included in the RISE dataset covers the questions most frequently used in previous wealth index measures. Items such as housing status, land status, roof condition and wall condition were asked in the survey. In addition, the measure asks about the main source of drinking water, the means of obtaining drinking water and the main source of lighting. In total, there are 10 items of housing characteristics in the original scale.
Expenditure data
For expenditure data, the survey involved two types of consumption: weekly food consumption and monthly non-food expense. For the former, participants were asked a set of questions regarding their weekly consumption of grains, fish, meat, egg, milk, vegetables and many other food groups. For the latter, participants were asked a set of questions regarding their monthly expense on house rent (if renting), electricity, transportation, gas, water, communication, health insurance and many other important spending categories. In total, there are 27 items in the original scale, with 14 items for food consumption and 13 items for non-food expense.
In running the CATPCA, we used the following criteria to determine a good-fit output. First, we examined the outliers. We looked at the object scores to investigate whether there were objects that lie at a substantial distance from the other object scores on a principal component. Following Linting and Van Der Kooij,9 objects with scores below -3.5 or above 3.5 are considered outliers. Second, variable selection was based on the communality of each variable (its total VAF). Variables with a total VAF of at least .25 were selected for further analysis.9 Third, we examined the variables’ factor loadings. As with PCA and factor analysis, variables with a factor loading of at least .4 are considered important.14 Finally, we also assessed the consistency of the dimensions through Cronbach’s alpha. A value greater than 0.7 suggests that a dimension has acceptable internal consistency.10
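The screening criteria can be sketched in a few lines of code. The example below is illustrative only: it treats object scores beyond ±3.5 as outliers (our reading of the cutoff), keeps variables with a communality of at least .25 and an absolute loading of at least .4 on some dimension, and computes the classical Cronbach’s alpha from item scores, which may differ from the value SPSS reports for CATPCA dimensions. All function names and toy values are hypothetical.

```python
import numpy as np

def flag_outliers(object_scores, cutoff=3.5):
    """Mark objects whose score on any dimension lies beyond +/- cutoff."""
    scores = np.asarray(object_scores, dtype=float)
    return (np.abs(scores) > cutoff).any(axis=1)

def select_items(communalities, loadings, min_vaf=0.25, min_loading=0.4):
    """Keep items with enough total VAF and a salient loading on some dimension."""
    comm = np.asarray(communalities, dtype=float)
    load = np.abs(np.asarray(loadings, dtype=float))
    return (comm >= min_vaf) & (load.max(axis=1) >= min_loading)

def cronbach_alpha(items):
    """Classical Cronbach's alpha for a respondents x items score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of the sum score
    return k / (k - 1) * (1 - sum_item_var / total_var)

# Toy values (hypothetical), three objects and three items.
outlier_mask = flag_outliers([[0.2, -1.1], [3.9, 0.4], [-0.5, -3.6]])
keep = select_items([0.60, 0.20, 0.30],
                    [[0.75, 0.10], [0.30, 0.33], [0.35, 0.42]])
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
print(outlier_mask, keep, alpha)
```

With these toy values, the second and third objects are flagged as outliers, the second item is dropped for low communality, and perfectly consistent items give an alpha of 1.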
We used SPSS 27 to perform CATPCA. Since we included three different domains, namely consumption/expenditure, assets, and housing characteristics, we opted for a 3-dimension solution. Therefore, we included all variables from each domain and specified three dimensions to be generated. Initially, we started with a total of 44 variables: 14 for food consumption, 13 for non-food expense, 7 for household assets and 10 for housing characteristics. We selected the orthogonal (Varimax) rotation method for each analysis.
Looking at the eigenvalues only, the 3-dimension output seemed to show positive results. However, when investigating the item-dimension configuration, we noticed that many variables with high factor loadings loaded on an irrelevant dimension. For instance, several food consumption variables loaded on the housing characteristics dimension. Even after removing variables with low factor loadings and low VAF, the output showed the same pattern.
To overcome this, we ran another analysis, this time specifying a 2-dimension solution. We also removed variables with low factor loadings: 3 from food consumption (e.g., grains and tobacco), 7 from non-food expenditure (e.g., house rent and house maintenance), 3 from household assets (e.g., bicycle and television) and 7 from housing characteristics (e.g., type of housing floor and wall). This time the output showed better results (see Table 1). All weekly food consumption variables are clustered in the same dimension, along with clothing and insurance expenditure, and sewage system. This dimension accounts for more than 25% of the variance in the data (eigenvalue = 6.01). Monthly non-food expense variables, e.g., electricity, gas and water, and communication expenditure, are pooled in the second dimension along with household assets. The second dimension accounts for more than 17% of the variance in the data (eigenvalue = 4.16). In total, there are 14 items in the first dimension (named food consumption, clothing and insurance, and sewage system) and 10 items in the second dimension (named non-food expense, assets, and other housing characteristics).
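The item counts and variance shares reported here can be checked with simple arithmetic. The sketch below assumes that each dimension’s share of variance equals its eigenvalue divided by the number of analysed items, which is the usual convention when every variable is standardized. The dictionary keys are hypothetical labels, while the counts and eigenvalues are the ones reported in the text.

```python
# Items in the original scales and items removed for low loadings (from the text).
original = {"food": 14, "non_food": 13, "assets": 7, "housing": 10}
removed = {"food": 3, "non_food": 7, "assets": 3, "housing": 7}

retained = {name: original[name] - removed[name] for name in original}
n_items = sum(retained.values())          # items left in the short form

# Reported eigenvalues of the two rotated dimensions.
eigenvalues = [6.01, 4.16]

# Assumption: share of variance = eigenvalue / number of analysed items.
vaf = [ev / n_items for ev in eigenvalues]
total_vaf = sum(vaf)

print(retained, n_items)
print([f"{share:.1%}" for share in vaf])
```

Under this assumption, the 24 retained items give shares of roughly 25% and 17%, or about 42% in total, which matches the figures reported for the two dimensions.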
Based on these results, we conclude that the wealth portfolio index is better suited to two dimensions, and that food consumption and non-food expense cluster into two different categories. These two, along with specific housing characteristics and household assets, are of great importance in determining the socio-economic position of a household, with about 42% of the variance accounted for by the two dimensions. This is considered a reasonable fit for a scale.9 In terms of reliability, both dimensions show acceptable levels (α = .89 for dimension one and α = .74 for dimension two).
To demonstrate the utility of wealth portfolio index, we ran correlational tests using other constructs in the RISE dataset. Table 2 shows the predictive validity of the proposed wealth portfolio index.
Finally, Figure 1 shows the rotated component loadings plot, indicating the relations between variables in the two dimensions. In Figure 1, we can see that wall condition is closely related to roof condition. Similarly, motorcycle ownership is positively related to most weekly food consumption variables: participants who own at least one motorcycle tend to have higher weekly food consumption.
In this study, we proposed a short version of the wealth portfolio index that also includes consumption/expenditure data, which have long been neglected due to their lengthy nature.2 By combining consumption/expenditure, household asset, and housing characteristics data, we aimed to reduce the number of items by running CATPCA. The findings show that the short version is suited to a two-dimension measure, and that it explains as much as 42% of the variance in the RISE dataset.
Specifically, the index contains two important dimensions. One is labelled “Food consumption, clothing and insurance, and sewage system”. The other is labelled “Non-food expense, assets, and other housing characteristics”. The former mainly includes important weekly food consumption items, such as fish, meat, egg and milk. With the addition of monthly clothing and insurance expenses and one particular housing characteristic, namely sewage system, this dimension is found to be the most important dimension in explaining the socio-economic position of households, accounting for more than 25% of the variance. The latter dimension focuses on monthly non-food expense, such as electricity and gas; assets, such as motorcycle and refrigerator; and other housing characteristics, namely wall and roof condition. This dimension is a proper supplement to the former because assets and such housing characteristics have also been shown to play a key role in determining socio-economic position.15
The wealth portfolio index developed in this study addresses a critical need for measures that integrate both accumulated assets and current consumption patterns. Traditional asset-only indices often overlook the immediate vulnerabilities captured by expenditure data, particularly concerning food and health, while comprehensive expenditure surveys are frequently impractical. The 24-item index offers a balanced, empirically derived solution. The first dimension contains 11 weekly food consumption items, 2 non-food expense items and 1 housing characteristic item. This shows that weekly food consumption is of great importance in the measure and explains much of the variance accounted for by the first dimension. The finding is in line with previous findings in Indonesia regarding dietary intake and diversity among the general population. Consumption of foods such as fish, meat, nuts, egg and milk remains relevant in determining socio-economic status among people living in Indonesia, mainly because the country still records high rates of malnutrition among pregnant women and of stunting in the population.16,17 Therefore, although household assets and housing characteristics are also important, food consumption in Indonesia still holds great importance in predicting the socio-economic position of a household.
By successfully including a wide range of dimensions in the measure, the finding shows that the short version of the wealth portfolio index is able to portray a more comprehensive picture of a household. Relying solely on household assets, e.g., motorcycle, and housing characteristics, e.g., sewage system, captures only part of the health risk experienced by people, especially those living in flood-prone regions. With the inclusion of food consumption, we are also able to capture the remaining part of the health risk, which enables more accurate research on the relation between wellbeing and resilience as well as on related interventions.
In addition, we argue that CATPCA is best suited to creating a wealth portfolio index involving various dimensions that are on different measurement scales. For instance, housing characteristic items were coded as 0 (poor quality) and 1 (good quality), whereas food consumption was estimated in Rupiah, which is on a ratio scale. By employing CATPCA, we were able to properly carry out a nonlinear component analysis by simultaneously including all of the items with different measurement scales.9 In this way, we were able to create a short, 24-item version of the wealth portfolio index that contains the most relevant food consumption, non-food expenditure, household asset and housing characteristics variables.
In conclusion, the novelty of this research is threefold. First, the conceptual integration of a ‘wealth portfolio’ that includes expenditure provides a more comprehensive SES lens for flood-prone populations, particularly in urban flood-prone regions. Second, the methodological application of CATPCA creates a psychometrically sound short-form index from diverse measurement scales. Third, the index offers a contextually relevant tool for assessing SES in Indonesian urban informal settlements. The distinct “Food consumption, clothing and insurance, and sewage system” dimension, which accounts for over 25% of the variance in the data, can be directly utilized by local governments and NGOs to identify households at high risk of food insecurity or poor sanitation, thereby enabling more targeted interventions. Similarly, the “Non-food expense, assets, and other housing characteristics” dimension can assist programs aimed at improving housing quality (e.g., wall condition and roof condition) or enhancing access to essential modern amenities such as electricity, gas, water, and communication technologies to improve livelihoods. The inclusion of items like ‘motorcycle’ and ‘refrigerator’ also provides proxies for economic capacity and access to resources critical for livelihood resilience in these communities.
This study is not without limitations. First, we acknowledge that our sample consisted of people living in flood-prone areas in Indonesia. By definition, such residences are mostly characterized by poor building quality, a lack of public facilities and dense population.18,19 Therefore, perceptions of the importance of certain food groups and household assets may be slightly different from those of people living in better conditions, e.g., in private residential complexes. Second, the RISE survey was conducted from November 2021 to January 2022, which was during the Covid-19 pandemic. The living conditions of many people might have been heavily affected by the pandemic, which in turn may have affected their food consumption, non-food expense and even the maintenance of their housing quality (see20 for how the pandemic affected Indonesian families). Therefore, the survey results may not completely reflect the general situation of the participants. However, we believe that the dataset still offers the best approximation of the current situation of the general population, considering that the high importance of food consumption is highly relevant to the current prevalence of malnutrition and stunting in Indonesia.17
Altogether, by having the two dimensions in such a short scale, the wealth portfolio index provides an effective way to investigate the socio-economic position of households more comprehensively without sacrificing the reliability of the data. Scholars can now include expenditure data without fearing high non-response rates, because the index contains only the most important expenditure variables. By including food consumption, researchers are able to conduct cross-comparison studies of health risk, educational outcomes, and their relation to the socio-economic position of a household (for instance,21). Overall, we believe that the 2-dimension wealth portfolio index is effective in portraying the living standard of households, especially among people living in flood-prone areas. To extend the measure’s generalizability, future studies are encouraged to test this measure among people living in better conditions and in different parts of the country.
The dataset generated and/or analysed during the current study is freely available, along with its documentation, in the online archiving system repository: https://doi.org/10.17026/dans-z5q-d3ae.22
Extended Data: The survey questionnaire and its codebook are freely available in the same online archiving system repository as the dataset: https://doi.org/10.17026/dans-z5q-d3ae.22
Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).
We thank all participants for taking part in the RISE survey and all RISE project team members for the fruitful discussions in every meeting.