Towards successful aging classification using machine learning algorithms

Jesuloluwa Zaccheus; Victoria Atogwe; Ayodele Oyejide; Ayodeji Olalekan Salau

doi:10.12688/f1000research.138608.1

Research Article

[version 1; peer review: 2 approved with reservations, 1 not approved]

Jesuloluwa Zaccheus¹, Victoria Atogwe¹, Ayodele Oyejide¹, Ayodeji Olalekan Salau²,³

PUBLISHED 25 Sep 2023

Author details

¹ Biomedical Engineering, Afe Babalola University, Ado Ekiti, Ekiti, 23401, Nigeria
² Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Tamil Nadu, Chennai, India
³ Department of Electrical/Electronics and Computer Engineering, Afe Babalola University, Ado Ekiti, Ekiti, Nigeria

Jesuloluwa Zaccheus
Roles: Conceptualization, Data Curation, Formal Analysis, Methodology, Software, Visualization, Writing – Original Draft Preparation

Victoria Atogwe
Roles: Data Curation, Investigation, Validation, Writing – Original Draft Preparation

Ayodele Oyejide
Roles: Formal Analysis, Methodology, Software, Validation

Ayodeji Olalekan Salau
Roles: Conceptualization, Data Curation, Investigation, Methodology, Supervision, Validation, Visualization, Writing – Review & Editing

This article is included in the Artificial Intelligence and Machine Learning gateway.

Abstract

Background: Aging is a significant risk factor for the majority of chronic diseases and impairments. Rising medical costs driven by the world's growing aging population increase the strain on families and communities. Successful aging (SA) offers a positive, quality-oriented perspective on aging. From a biological perspective, it refers to the state of being free from diseases or impairments that hinder normal functioning. This differs from typical aging, which is associated with a gradual decrease in both physical and cognitive capacities as individuals grow older.
Methods: In this study, geriatric data acquired from the Afe Babalola University Multi-System Hospital, Ado-Ekiti, were first pre-processed, and three fundamental machine learning (ML) models (artificial neural network, support vector machine, and Naive Bayes) were then built using data from a sample of 2000 individuals. Following the Rowe and Kahn model, individuals were labelled as SA based on factors such as having two or fewer diseases, quality of life, nutrition, and capacity for everyday activities.
Results: According to the experimental findings, the artificial neural network (ANN) performed better than the other models in predicting SA, with 100% accuracy, 100% sensitivity, and 100% precision.
Conclusions: The results show that ML techniques can assist social and health policymakers in their decisions on SA. The presented ANN-based method surpasses the other ML models in classifying people into SA and non-SA categories.

Keywords

Quality of Life, Aging, Machine Learning, ANN, Population

Corresponding author: Ayodeji Olalekan Salau

Competing interests: No competing interests were disclosed.

Grant information: The author(s) declared that no grants were involved in supporting this work.

Copyright: © 2023 Zaccheus J et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Zaccheus J, Atogwe V, Oyejide A and Salau AO. Towards successful aging classification using machine learning algorithms [version 1; peer review: 2 approved with reservations, 1 not approved]. F1000Research 2023, 12:1201 (https://doi.org/10.12688/f1000research.138608.1) First published: 25 Sep 2023, 12:1201 (https://doi.org/10.12688/f1000research.138608.1) Latest published: 03 Apr 2024, 12:1201 (https://doi.org/10.12688/f1000research.138608.2)

Introduction

The World Health Organization (WHO) projects that by 2050, there will be nearly 1.6 billion older people in the world, or roughly 16% of the total population.¹ We are currently experiencing significant sociodemographic and lifestyle changes, particularly in industrialized nations, which are shifting from an aging to a super-aging population.²,³ Nigeria, the largest nation in Africa with a leading economy, has the 19th-highest percentage of the world's elderly population, and this percentage is expected to almost triple over the next two decades.⁴ However, the growth in the number of older Nigerians is taking place against a backdrop of extreme poverty, unresolved development issues, socioeconomic disparity, and a decline in the traditional support and care of senior citizens.

Human life expectancy has increased globally due to advancements in medicine and social science, making population aging a problem for all nations.⁵,⁶ While the extension of life expectancy is a significant scientific achievement, it has also resulted in increased expenses for social welfare and care for the elderly.⁷ This demographic shift has not only influenced disease patterns and increased chronic illnesses worldwide but has also posed socioeconomic challenges for governments and families.⁸–¹⁰ In light of these developments, it is essential to consider the quality of life of the elderly and their preferences during this stage of life. Although everyone desires to live longer, it is more important for both individuals and society to focus on enhancing quality of life and reducing the burden of diseases in old age.¹¹

To address the challenges associated with population aging, the concept of successful aging (SA) has emerged. SA recognizes that the aging process is unique to each individual and encompasses various aspects across disciplines.¹²,¹³ Although there is no formal definition for SA, it is widely accepted that it involves being free from chronic illnesses and having healthy physical and mental functioning.¹⁴,¹⁵ Rowe and Kahn proposed an operational hypothesis for SA, which consists of three components: active engagement in life, absence of illness or impairment, and optimal physical and cognitive functioning.¹⁶,¹⁷ This hypothesis is widely recognized in academic circles and emphasizes how elderly individuals adapt to the physical, spiritual, and social changes brought about by aging.¹⁸,¹⁹

The concept of SA has evolved from a single-dimensional focus on the presence of disease or functional decline to a multidimensional perspective aligned with the World Health Organization’s definition of health, encompassing physical, mental, social, and spiritual well-being.²⁰ However, defining this complex and multidimensional phenomenon has proven challenging due to inherent ambiguity.²¹

It is noteworthy that non-genetic factors, in addition to genetic influences, have a considerable impact on aging.¹⁵ There have not been many long-term studies on SA, and prior research has usually concentrated on the factors that affect SA.²²,²³ Because of the co-dependence and complexity of the factors impacting SA, traditional statistical models are inappropriate.²⁴ Machine learning (ML) methods have become increasingly important in recent years for handling challenging, multidimensional, and nonlinear problems.²⁵ As a result, it is possible to develop an intelligent model to predict whether or not an individual will achieve SA.

The use of machine learning to predict and identify social aspects of aging has been studied in some detail. For instance, the authors in Ref. 26 used a sample of 983 to train five fundamental ML models (ANN, DT, SVM, NB, and K-NN) combined through one ensemble technique. The prediction was obtained by majority voting, which relies on the collective decision of the developed base models. The authors attained 93% accuracy, 92% specificity, and 87% sensitivity. The authors in Ref. 27 used questionnaires and fitness tests to gather the necessary information from the elderly population. The models used were gradient boosting decision trees, random forests, deep learning, and logistic regression. In that study, which involved 890 samples, the deep learning model outperformed the other models, achieving an accuracy of 89.3%, a positive predictive value of 85.8%, and a specificity of 93.1%. The authors concluded that the deep learning model is excellent for SA maintenance prediction. The use of machine learning approaches to predict successful aging in the elderly was discussed in Ref. 28. For the analysis, the researchers looked at the SA and non-SA data of 975 elderly persons. The Chi-square test at P > 0.05 was used to determine the factors that had the greatest impact on SA. Several algorithms, namely Adaptive Boosting, Random Forest, Artificial Neural Network, Support Vector Machine, and Naive Bayes, were employed to develop prediction models, and their performance was evaluated using various metrics. The sensitivity (the ability to correctly identify positive cases) was 91%, the specificity (the ability to correctly identify negative cases) was 98%, the overall accuracy was 95%, the F-measure was 90%, and the area under the curve (AUC), which measures the model's ability to distinguish between positive and negative cases, was 88.4%. Based on these evaluations, it was concluded that the Random Forest algorithm exhibited the best performance in predicting SA in elderly individuals.

The previous works, however, have some limitations, relating to: insufficient data for training the models, class imbalance in the dataset, the technique used to replace missing values in the dataset, the use of training ratios and hyperparameter tuning to improve the model's accuracy, and the selection of significant aging-related features in the dataset. To obtain the optimal form of the raw data from the hospital's electronic medical record, this study analysed a sufficiently large dataset while utilizing data preparation techniques and consultations with gerontologists. To increase the proposed system's predictive ability, effective feature selection and dropout were combined, and classification was carried out using an ANN. In this study, three key ML models for SA prediction were created, described, and evaluated. The main objective was to develop SA prediction models using geriatric data while taking into account the sociodemographic, clinical, and lifestyle characteristics in the dataset, which are crucial factors for early SA prediction. In addition, the SA prediction models were used to further investigate key determinants of the progression of the aging condition. The subsequent sections present and discuss the development of the proposed SA prediction system, the results, and the conclusion.

Methodology

System architecture

Figure 1 depicts the main steps of the proposed system architecture: data pre-processing, feature selection, model construction, performance measurement, and classification of successful aging. First, in the data pre-processing stage, the raw dataset, which is unstructured and not directly usable for model design, is assembled by sorting out variables from the electronic medical records of elderly patients. Second, feature selection is used to separate redundant features from those that support accurate prediction and to extract the significant variables relevant to successful aging. Third, three prediction models were developed for this study: Support Vector Machine (SVM), Naive Bayes (NB), and Artificial Neural Network (ANN), each trained to identify whether an individual has the status of SA or non-SA. During development, particular hyperparameters for the SVM and NB models were carefully selected. The pre-processed data were then fed into each model for training. The ANN model, combined with a dropout strategy, provides a better learning pattern for predicting successful aging, and this combination offers better classification performance than the other models. The network uses the ReLU activation function in its hidden layers and the sigmoid activation function in its output layer. Finally, the best hyperparameters and training ratios are used to increase the models' predictive accuracy.

Figure 1. Proposed system architecture.

Study parameters

The present study is retrospective and concentrates on sociodemographic, clinical, behavioral, and psychological parameters, assessed by accessing the medical records of older patients (> 60 years old).²⁹ The preliminary data gathered contain multiple variables. The most useful features were extracted from the gathered data and used as input parameters, and the machine learning models' performance was validated with them. Additionally, the factors linked to successful aging were determined through interactions with gerontology experts and analyses of relevant literature. The input variables used to determine the output class (SA or non-SA) are described as follows. Age, sex, reading proficiency, marital status, occupation, and income level are the parameters considered in the category of sociodemographic factors.

The clinical factors involve diseases including hypertension, heart disease, kidney, liver, bone, and muscular disorders as well as depression, eye and eyelid disorders, diabetes, and cancer.

The category of behavioral and psychosocial factors includes the ability to carry out activities of daily living (ADLs), life satisfaction, quality of life (QOL), a healthy lifestyle, interpersonal relationships, nutrition, physical activity, illness prevention strategies, and stress management. The sociodemographic and clinical information was taken from the medical records of elderly persons.

The outcome variable was split into SA (coded 1) and non-SA (coded 0) classes. Rowe and Kahn's approach,¹⁶ which consists of three fundamental elements (maintaining good mental and physical function, ongoing involvement in life, and the absence of disease and disease-related disability), was used in this study to quantify SA.

Data collection and pre-processing

For this study, the dataset comprised 2000 older persons and was extracted from the electronic medical records of the Afe Babalola University Multi-System Hospital, Nigeria, covering January 2019 to April 2023. The hospital's research ethics committee granted access to conduct the study (Approval Number: AMSH/REC/AVA/133), subject to compliance with all international guidelines and regulations. The data were collected retrospectively and did not involve interviews with participants. All data used in this research were anonymised and do not reveal any patient's identity in any form.

Several data pre-processing techniques were employed to generate optimal models. The dataset contained some missing values. Excluding the affected records would diminish the overall quality of the data, as they could hold information that influences the accuracy of predictions. The missing values were therefore filled in using the mean value of the corresponding feature in the dataset. Another problem with the acquired data was class imbalance, which occurs when one class has disproportionately fewer samples than the other and decreases the effectiveness of ML algorithms.³⁰ After pre-processing and balancing, the dataset includes 2000 records of both successfully aged and unsuccessfully aged persons. Records were categorized as SA based on having two or fewer diseases, quality of life, nutrition, and capacity for daily activities. Following classification of the full dataset, there were 1100 instances of SA and 900 instances of non-SA.
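The mean-imputation step mentioned in this section can be illustrated with a minimal sketch. It is not the authors' code: the function name `impute_column_means`, the use of `None` to mark a missing value, and the two example columns are assumptions made purely for illustration.

```python
from statistics import mean

def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(mean(observed))  # raises StatisticsError if a column is entirely missing
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]

# Hypothetical records: [age, quality-of-life score]; None marks a missing value.
records = [[65, 3.0], [70, None], [None, 5.0], [81, 4.0]]
imputed = impute_column_means(records)
```

The means are computed from the observed values only, so rows with gaps are retained rather than dropped, which matches the motivation given in the text.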

Feature selection

Feature selection was used to reduce the dimension of the dataset and enhance ML performance. It is a crucial technique in data mining that uses statistical techniques to filter out redundant and irrelevant features. In summary, its advantages include better comprehension, avoidance of algorithm overfitting, faster processing, and improved mining performance.³¹,³² With the help of gerontologists, features for our model were selected based on Rowe and Kahn's model for successful aging, by identifying the data pertinent to successful aging outcomes in the elderly group.

Age, hypertension/CVD, renal disease, liver disease, neuromuscular disease, depression, eye disease, diabetes, cancer, ADLs, nutrition status, and quality of life (QOL) were the determinant variables most pertinent to SA, and were therefore considered the most important in defining SA in elderly people. Information such as marital status, occupation, educational level, gender, and hospital department was removed from the dataset in order to increase the precision, sensitivity, and accuracy of all the models produced. Figure 2 shows the correlation between the selected features in the extracted dataset; the maximum correlation is observed between renal disease and the ability to carry out daily activities.

Figure 2. Correlation between features of the dataset.
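A pairwise correlation of the kind summarised in Figure 2 could be computed as in the sketch below. The text does not state which correlation coefficient was used, so the Pearson coefficient is an assumption here, and the two binary columns `renal` and `adl_impaired` are hypothetical values, not data from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant column has zero spread and would raise ZeroDivisionError here.
    return covariance / (spread_x * spread_y)

# Hypothetical binary indicators for eight patients (1 = condition present).
renal = [0, 1, 1, 0, 1, 0, 0, 1]
adl_impaired = [0, 1, 1, 0, 1, 0, 1, 1]
r = pearson(renal, adl_impaired)
```

Applying the same function to every pair of selected features would give the matrix that a correlation heat map displays.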

Figure 3 shows the variables in the dataset, including the feature name, feature type, and concept codes. In the dataset, NMD represents all categories of neuromuscular diseases, QOL refers to the quality of life of each individual, and ADLs represents the ability of aged people to carry out daily activities regularly.

Figure 3. Cross-section of the input dataset.

Development of classification models

For SA prediction, three supervised learning approaches were used. The elderly individuals were classified as either SAs or non-SAs using the ANN, Naïve Bayes, and Support Vector Machine algorithms.

ANN: An artificial neural network (ANN) is a machine learning technique that emulates the natural information processing mechanisms of the human brain. The network's structure is made up of numerous processing units (neurons) that communicate with one another via weights. A nonlinear mechanism in the neural network enables parallel processing, learning, and decision-making. An ANN adjusts its weighted connections by learning from a variety of training cases. One outcome of this process is that the network's parameters can be modified so that it can be retrained in a different environment.³³

Naïve Bayes: This algorithm assumes that the features are independent of one another given the class, and it is based on Bayes' theorem. Because it is a probabilistic model, NB calculates the likelihood that a data sample belongs to a specific class as part of its process.³⁴,³⁵ NB is also known as independent Bayes or simple Bayes. The technique is straightforward to develop and does not require difficult initial parameter estimates. As a result, it offers high accuracy and speed and can be applied to very large datasets. However, the technique is limited by its assumption of class-conditional independence and by its need for access to the data probabilities.³⁶,³⁷
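The Naïve Bayes idea can be sketched in a few lines for the Gaussian case. This is a simplified illustration rather than the implementation used in the study: the function names, the small variance floor of `1e-9`, and the six toy records of [age, number of diseases] are all assumptions.

```python
from collections import defaultdict
from math import log, pi

def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # The added constant keeps every variance strictly positive (sketch assumption).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, features):
    """Return the class with the largest log-posterior, treating features as independent."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = log(prior)
        for v, m, var in zip(features, means, variances):
            score += -0.5 * log(2 * pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Hypothetical records: [age, number of diseases]; label 1 = SA, 0 = non-SA.
X = [[62, 0], [65, 1], [68, 1], [80, 4], [84, 3], [79, 5]]
y = [1, 1, 1, 0, 0, 0]
nb_model = fit_gaussian_nb(X, y)
label_a = predict_gaussian_nb(nb_model, [64, 1])
label_b = predict_gaussian_nb(nb_model, [82, 4])
```

Summing per-feature log-likelihoods is where the independence assumption enters: each feature contributes to the score separately, with no interaction terms.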

Support Vector Machine: According to Zach,³⁸ the Support Vector Machine (SVM) is a supervised learning technique used for classification and regression tasks. Its primary purpose is to identify the most effective classifier that can divide a given dataset into two distinct classes. When dealing with datasets that can be linearly separated, the SVM employs a linear function to determine a hyperplane that passes through the middle of the two classes. Multiple separating hyperplanes can exist in such cases, but the SVM identifies an optimal one by maximizing the margin between the two classes. The margin, as described in Ref. 39, refers to the degree of separation between the classes, which is represented by the hyperplanes.

In machine learning, hyperparameters are the parameters used to regulate the learning process. They are values set before the model begins learning, with the aim of enhancing the learning outcomes. Not all hyperparameters carry equal significance: some affect the effectiveness of the ML algorithm more than others. Table 1 lists the hyperparameters that we employed in our models.

Table 1. Model hyperparameters.

ML Algorithm	Hyperparameters
ANN	number of hidden layers = 2, optimizer = adam, dropout = 0.2
Naïve Bayes	classifier = GaussianNB
Support Vector Machine	kernel = rbf, random_state = 0

Design of classification models

The classifier design process comprises training and hyperparameter tuning. The Support Vector Machine, Naive Bayes, and Artificial Neural Network are the classification models investigated. For all generated models, training was typically done on between 50% and 70% of the dataset. Table 1 lists the optimized hyperparameters used in various iterations of the Naive Bayes and Support Vector Machine models.

Initially, the ANN model was trained for 100 and 250 epochs; later in the experiment, a batch size of 32 and 500 epochs were used for adequate training. To find the best model configuration, we initially trained our models on a variety of train/test ratios (40/60, 50/50, 60/40, and 70/30). The 70/30 and 50/50 configurations produced the best accuracy with the least over-fitting, and are therefore the ones presented in this study. The Adam optimizer was used to train the model, and network hyperparameters such as the number of hidden layers and the activation function were also tuned. The ReLU activation was used in all hidden layers, while the output layer for classification used a sigmoid activation function. To reduce over-fitting and improve the model's accuracy, the dropout technique⁴⁰ was applied. The proposed framework was implemented in Python using the Anaconda distribution and libraries including scikit-learn, Keras, TensorFlow, NumPy, and Matplotlib. All experiments were carried out on a Dell Precision workstation with 8 GB RAM, a 1 TB HDD, and a 3.3 GHz Core i5 processor. The test dataset was used to assess the prediction performance of our model, and 20% of the test dataset was used for validation.
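To make the network structure concrete, the sketch below runs a forward pass through two ReLU hidden layers, a dropout rate of 0.2, and a sigmoid output, as listed in Table 1. It is an illustration in plain Python rather than the Keras model used in the study: the layer widths, the weight values in `params`, and the choice to apply dropout after each hidden layer are assumptions, since the text does not specify them.

```python
import math
import random

def relu(values):
    return [max(0.0, v) for v in values]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def dense(inputs, weights, biases):
    """Fully connected layer: one row of weights per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def dropout(values, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the survivors."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

def forward(x, params, rate=0.2, training=False, rng=None):
    """Return the predicted probability of the positive class (SA)."""
    rng = rng or random.Random(0)
    h = x
    for weights, biases in params["hidden"]:
        h = relu(dense(h, weights, biases))
        if training:  # dropout is active only while training
            h = dropout(h, rate, rng)
    w_out, b_out = params["output"]
    return sigmoid(dense(h, [w_out], [b_out])[0])

# Hypothetical weights: 3 inputs -> 2 hidden units -> 2 hidden units -> 1 output.
params = {
    "hidden": [
        ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1]),
        ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ],
    "output": ([1.2, -0.7], 0.05),
}
probability = forward([1.0, 0.0, 1.0], params)
predicted_class = int(probability >= 0.5)  # 1 = SA, 0 = non-SA
```

At inference time dropout is switched off, so the prediction is deterministic; during training the random masking discourages the network from relying on any single unit.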

Evaluation of the machine learning models

In this investigation, the holdout validation technique was employed instead of cross-validation, since the dataset contains sufficient samples for both training and testing. After training, the performance of the SVM, NB, and ANN models was assessed using the testing dataset. The most commonly used metrics, namely accuracy, precision, sensitivity, specificity, and F1 score, were used to assess performance. Accuracy indicates how well the model classifies the samples overall. The numbers of true positives, true negatives, false positives, and false negatives can be obtained from a confusion matrix (Figure 4), which further assisted in evaluating the effectiveness of the proposed model. Table 2 provides the formulas needed to calculate these measures.
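A holdout split of the kind described here can be sketched as follows. The function `holdout_split`, the fixed seed, and the integer stand-ins for patient records are hypothetical; the sketch shuffles and cuts the data once and makes no attempt to stratify by class, which the text does not discuss.

```python
import random

def holdout_split(records, train_fraction, seed=0):
    """Shuffle a copy of the records once and cut it into training and test parts."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Integers stand in for the 2000 patient records.
records = list(range(2000))
train_70, test_30 = holdout_split(records, 0.7)
train_50, test_50 = holdout_split(records, 0.5)
```

Unlike k-fold cross-validation, each record is used either for training or for testing, never both, so the estimate depends on a single partition of the data.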

Figure 4. Confusion matrix.

Table 2. Performance metrics formula.

Performance metrics	Formulas
Accuracy	$\frac{TP + TN}{TP + TN + FP + FN}$
Precision	$\frac{TP}{TP + FP}$
Specificity	$\frac{TN}{TN + FP}$
Sensitivity	$\frac{TP}{TP + FN}$
F1 Score	$2 \times \frac{Precision \times Sensitivity}{Precision + Sensitivity}$

Accuracy: This measures the proportion of correct predictions among all predictions.

Precision: This metric assesses how reliable the model's positive predictions are. It is the ratio of true positive predictions to all the positive predictions made by the model.

Sensitivity: This measures how accurately the model identifies the positive cases. It is the proportion of true positive predictions among all actual positives.

Specificity: This measures how accurately the model identifies the negative cases. It is the proportion of true negative predictions among all actual negatives.

F1 score: This is the harmonic mean of precision and sensitivity.

To apply the equations in Table 2, the predictions in each experiment conducted in this work are divided into four groups: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN).
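The formulas in Table 2 translate directly into code. In the sketch below, the function name `classification_metrics` and the confusion-matrix counts are hypothetical and chosen only to keep the arithmetic easy to follow; they are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the Table 2 metrics from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "f1": f1,
    }

# Hypothetical counts for a test set of 100 people.
metrics = classification_metrics(tp=45, tn=35, fp=15, fn=5)
```

With these counts, accuracy is 80/100, precision is 45/60, specificity is 35/50, and sensitivity is 45/50. The sketch does not guard against empty denominators, which arise when a model never predicts the positive class.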

Results and discussion

This section presents the findings of the study. First, the results of all experiments conducted with 50% of the data used for training are reported, followed by the results of experiments conducted with 70% used for training. The confusion matrices display the classification outcomes against the actual labels. Since there are two classes of data, 0 for unsuccessful aging and 1 for successful aging, these matrices have dimensions 2 × 2. The performance measures are computed from the entries of the confusion matrices. Graphs and tables are also presented and interpreted to show how the models performed on the datasets during the experiments. Finally, the performances of all three models are compared and critiqued.

Experimental results for SVM, NB, and ANN models

Experimental results on 50% of training

In this experiment, half of the samples (1000) were used for training and the other half (1000) for testing. The confusion matrices in Figures 5, 6, and 7 were used to assess the ML algorithms (SVM, NB, and ANN) and to calculate the values of the performance measures. Figure 5 shows that the SVM model correctly predicted 550 cases of SA, with no SA cases misclassified. The model also correctly predicted 183 non-SA cases while mispredicting 267 non-SA cases. It yielded an accuracy of 73.3% and a specificity of 67.3%. Although the SVM model had a sensitivity of 100%, sensitivity alone does not reflect how well the model would perform on unseen data at a large scale, as is evident from its F1 score (57.8%). Figure 6 shows that the NB model correctly predicted 561 cases of SA, with no SA cases misclassified. Additionally, it correctly predicted 371 non-SA cases while mispredicting 68 non-SA cases. It achieved a precision of 84.5% and an accuracy of 93.2%. The NB model outperformed the SVM model by a large margin and is expected to be effective in predicting successful aging on a fresh dataset. Figure 7 shows the confusion matrix for the ANN model trained on 50% of the data, which correctly predicted 550 and 450 instances of SA and non-SA, respectively. With an accuracy 6.8 percentage points higher, the ANN model with dropout fared better than the NB model.

Figure 5. Confusion matrix for SVM50 classifier.

Figure 6. Confusion matrix for NB50 classifier.

Figure 7. Confusion matrix for ANN50 classifier.

Experimental results on 70% of training

For this experiment, 70% of the samples (1400) were assigned for training and 30% (600) for testing. The confusion matrices in Figures 8, 9, and 10 were used to evaluate the three machine learning methods (SVM, NB, and ANN) and to calculate the values of the performance measures. Figure 8 shows that the SVM model correctly predicted 330 SA instances, with no SA cases misclassified. Additionally, the model correctly predicted 178 non-SA cases while mispredicting 92 non-SA cases. It achieved an accuracy of 84.7% and a precision of 65.9%. Figure 9 shows that the NB model correctly predicted 332 SA instances, with no SA cases misclassified. Additionally, the model correctly predicted 224 non-SA cases while mispredicting 44 non-SA cases. It achieved an accuracy of 92.7% and a precision of 83.6%. The NB model outperformed the SVM model by a large margin and is expected to be effective in predicting successful aging on a fresh dataset. Figure 10 shows the confusion matrix of the ANN model trained on 70% of the data, which correctly predicted 330 and 270 instances of SA and non-SA, respectively. With a precision 16.4 percentage points higher, the ANN model with dropout surpassed the NB model.

Figure 8. Confusion matrix for SVM70 classifier.

Figure 9. Confusion matrix for NB70 classifier.

Figure 10. Confusion matrix for ANN70 classifier.

ANN model accuracy and loss plots

Figure 11 shows the accuracy plot of the ANN model. Between 50 and 100 epochs, the training accuracy grew from 0.8 to 0.9. At 300 epochs, the validation accuracy was 0.92, and it continued to grow from there. The reported average training accuracy was 100%. The plot shows that the ANN model is not over-fitting, which suggests that the model can be used to predict successful aging on new data. Figure 12 shows the model's training and validation losses. As seen in the graph, the loss function had reached its lowest value by the end of training, at 500 epochs. The ANN model's low loss supports its ability to provide accurate predictions of successful aging.

Figure 11. Training and validation accuracy of ANN50 model.

Figure 12. Training and validation loss of ANN50 model.

Performance of all the ML models

The performance results of all the ML models created are displayed in Table 3. The precision criterion measures the proportion of individuals assigned by the classifier to the positive class (SA) who are actually positive. By this criterion, the proposed ANN model has the best performance, whereas the SVM model is the least efficient technique. The NB50 model's accuracy is 93.2%, which is 0.5 percentage points higher than that of the NB70 model. All three algorithms (ANN, SVM, and NB) achieved the best value on the sensitivity criterion, which is crucial for locating every SA-positive person in the dataset. When it comes to identifying everyone who does not have SA, the ANN50 and ANN70 models perform better than the other ML models: with a value of 100%, they demonstrate the highest level of success in terms of specificity. An algorithm is considered highly efficient when it strikes a favorable balance between sensitivity and specificity, and the optimal design proposed in this research achieves the best balance between these two criteria. The F1-score, which takes into account both sensitivity and precision, has a value of 100% for the ANN-based models and is higher than that of the NB50 and NB70 models (91.6% and 91.1%).

Table 3. Evaluation of the efficiency of ML models.

Model	Accuracy (%)	Precision (%)	Specificity (%)	Sensitivity (%)	F1-Score (%)
NB50	93.2	84.5	89.2	100	91.6
SVM50	73.3	40.7	67.3	100	57.8
ANN50	100	100	100	100	100
NB70	92.7	83.6	88.3	100	91.1
SVM70	84.7	65.9	78.2	100	79.5
ANN70	100	100	100	100	100
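The five metrics in Table 3 follow the standard confusion-matrix definitions. The sketch below illustrates them under stated assumptions: the function names and the confusion counts are hypothetical, and only the NB50 precision and sensitivity values are taken from Table 3.

```python
def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (both as fractions)."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


def classification_metrics(tp, fp, tn, fn):
    """Return the Table 3 metrics as fractions in [0, 1].

    Assumes every denominator is non-zero, i.e. the data contain both
    classes and the classifier predicts at least one positive.
    """
    precision = tp / (tp + fp)      # predicted SA that is truly SA
    sensitivity = tp / (tp + fn)    # true SA that is found
    specificity = tn / (tn + fp)    # true non-SA that is found
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "f1": f1_score(precision, sensitivity),
    }


# Hypothetical confusion counts (assumption, not from the study).
example = classification_metrics(tp=40, fp=10, tn=45, fn=5)

# Consistency check on the NB50 row: precision 84.5%, sensitivity 100%.
nb50_f1 = f1_score(0.845, 1.0)
```

Applied to the NB50 row, the F1 formula gives a value that rounds to the 91.6% reported in Table 3. Because the table entries are themselves rounded, recomputing other rows from the rounded precision and sensitivity may differ in the last digit.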

Accuracy is the most fundamental and direct indicator of a classifier’s performance: the proportion of samples that are classified correctly. The SVM50 model has the lowest classification accuracy (73.3%), and the best classifier is the ANN-based approach, with an accuracy of 100%. Figure 13 shows a bar chart that compares the machine learning models according to their F1-score, specificity, sensitivity, accuracy, and precision.

Figure 13. Bar graph showing performance of machine learning models.

Discussion

This study assessed subjective well-being by examining three key dimensions: physiological, cognitive psychological, and social functioning. This approach aligns with Rowe and Kahn’s theory.¹⁸ In order to determine if a person has SA or not, we developed prediction models that would utilize clinical and lifestyle factors as inputs. Our findings offer important new perspectives for determining SA likelihood. According to our main hypothesis, the proposed ML technique produced a potent SA status classifier. Here, we introduced a novel approach that uses three key ML techniques to forecast SA.

A prediction model can be created using a variety of ML methods. However, the numerous fundamental model assumptions built into today’s ML approaches can preclude their successful application. The optimal method for dealing with a dataset that is both highly variable and noisy, as is the case with successful aging data, which is inherently diverse and inconsistent, remains uncertain, because it is frequently challenging to verify these fundamental assumptions. Furthermore, no single ML technique yields reliable prediction outcomes in every setting. Researchers are continually seeking well-trained machine learning models that perform accurately and reliably on a consistent basis. In reality, however, trained models are sometimes biased, so their outputs are not always flawless.

Table 4 presents a collection of studies that have used machine learning techniques to forecast and recognize instances of successful aging through diverse approaches. We compared these studies with the proposed approach. In addition, we provided an ANN-based technique to determine whether the tested individuals fall into the SA category or not. In the first phase, the data were preprocessed to make them appropriate for use in data mining analysis. Thereafter, the ANN was introduced, which proved more effective in predicting SA than the other ML models considered, SVM and NB. The results of the experiment demonstrate that, with a holdout validation approach, the predictive system achieved outstanding performance in forecasting SA: its precision, specificity, accuracy, and F1-score all reached 100%.

Table 4. Comparison of the proposed optimal model with existing methods for the prediction of Successful Aging.

Authors	Purpose	Classifier(s)	Accuracy (%)
⁴¹	Prediction of mental impairment in aged people	Ensemble classifier	87.4
²⁷	Prediction of SA maintenance	Deep learning	89.3
²⁶	Prediction of SA	Ensemble KNN	89.6
²⁸	Prediction of SA	Random Forest	95
Implemented work	Prediction of SA	Artificial Neural Network	100
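The holdout validation mentioned above sets part of the data aside for evaluation only. A minimal sketch of a stratified holdout split is given below. The 70% training fraction, the class ratio, and the function name are illustrative assumptions and not the settings used in the study. Only the total of 2000 records comes from the dataset description.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, train_fraction, seed=0):
    """Return (train_idx, test_idx) with class proportions preserved."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # random order within a class
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])           # used for fitting
        test_idx.extend(indices[cut:])            # held out for evaluation
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels (assumption): 1 = SA, 0 = non-SA, 2000 records.
labels = [1] * 600 + [0] * 1400
train_idx, test_idx = stratified_holdout(labels, train_fraction=0.7)
```

Splitting within each class keeps the share of SA records the same in both parts, so the metrics computed on the held-out part are not distorted by an unlucky draw.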

Based on the findings of the current study, the examined machine learning (ML) methods demonstrate consistent accuracy in predicting successful aging (SA) in older individuals. The computed metrics indicate that the ML models, trained using specific attributes, accurately predicted SA. The use of past theoretical and empirical studies in feature selection may improve prediction accuracy by successfully reducing the number of irrelevant or superfluous variables in the model. In order to prevent overfitting, dropout regularization was also used, which improved the accuracy of the ANN model and extended the scope of its application.
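Dropout regularization randomly silences units during training so that the network cannot rely on any single unit. The sketch below shows the commonly used "inverted dropout" formulation on a plain list of activations. The dropout rate, the activation values, and the function name are assumptions, since the rate used for the ANN is not stated in this section.

```python
import random


def dropout(activations, rate, training, rng=None):
    """Inverted dropout on a list of activations.

    During training each unit is dropped with probability `rate` and the
    survivors are scaled by 1 / (1 - rate), which keeps the expected
    activation unchanged. At inference time the input is returned as is.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Hypothetical hidden-layer activations and rate (assumptions).
hidden = [0.5, 1.0, 1.5, 2.0]
train_out = dropout(hidden, rate=0.5, training=True, rng=random.Random(42))
infer_out = dropout(hidden, rate=0.5, training=False)
```

Because the scaling is applied during training, no correction is needed when the trained model is used for prediction.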

Even though our models performed well in estimating SA in older persons, the study has a few potential weaknesses that need to be mentioned. First, this study retrospectively examined a dataset from a single database, which affects the accuracy, completeness, and generalizability of the data. The prediction models could have been negatively impacted by inconsistent, insufficient, inaccurate, and irregular data items in this dataset. Therefore, the selection standard for each variable was established through conversations with gerontology experts in order to increase data homogeneity in the pre-processing stage. This made it easier for us to extract the most useful raw data from the hospital’s electronic medical record. Second, this study applied only three ML algorithms to a sample dataset of 2000 records. The accuracy and generalizability of our models would likely improve if a wider range of machine learning methods were assessed using larger, multicenter, and prospective datasets. Third, the outcomes of the current investigation should be confirmed using an external validation approach. Fourth, the link between the predictor and outcome factors was not examined in this study. Future research should explore a series of long-term factors associated with successful aging, including disability, mental illness (depression, schizophrenia), substance abuse, and poor quality of life. Additionally, the suggested model can be enhanced to provide scalable prediction models for successful aging.
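One simple way to begin examining the link between a predictor and the outcome, noted above as unexamined, is a correlation coefficient. With a binary outcome, the Pearson coefficient coincides with the point-biserial correlation. The sketch below is illustrative only: the variable names and values are hypothetical and are not drawn from the study dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical values (assumption): a quality-of-life score and SA status.
qol_score = [8, 9, 7, 3, 4, 2]
sa_status = [1, 1, 1, 0, 0, 0]   # 1 = SA, 0 = non-SA
r = pearson_r(qol_score, sa_status)
```

A coefficient very close to 1 or -1 for a single predictor would suggest that the predictor nearly determines the outcome on its own, which is worth knowing before interpreting a classifier's accuracy.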

Conclusion

In this study, the assessment of successful aging (SA) was carried out by considering three aspects of a person’s health state, namely physiological, cognitive psychological, and social function. The study built upon the SA model proposed by Rowe and Kahn and employed machine learning techniques to predict successful aging. The study’s findings indicate that the proposed machine learning (ML) model can improve the prediction of SA, making it a promising approach. On average, the accuracies of the six ML models employed in this study in predicting successful aging in the aging population dataset were >90%, >87%, and >86%. The developed ANN models, however, reached 100% accuracy and sensitivity. The results show that machine learning offers a promising method for improving SA prediction. The results of this study have the potential to benefit geriatricians and senior nurses by enhancing their ability to provide excellent assistance and care to the elderly. Additionally, the presented prediction models can serve as a reliable and adaptable tool for healthcare executives and policymakers, enabling them to enhance geriatric outcomes. In future work, improved data instances that better point out the factors important to successful aging will be acquired to develop scalable and generalizable models for predicting successful aging across different sources of data.

Declarations

Authors contribution

JEZ – Conceptualization, Methodology, Software, Result Analysis, and Writing (original draft preparation, final review, and editing); VAO – Software, Data Curation, and Original Draft; AJO – Investigation, Visualization, and Writing (final review and editing); ASO – Software, Data Curation, Methodology, and Writing (final review and editing). All authors read and approved the manuscript.

Data availability

Zenodo: Successful Aging Dataset for Elderly Patients. DOI: 10.5281/zenodo.8132494

This project contains the following underlying data:

• SAData.xlsx (anonymized records of elderly individuals for the prediction of successful aging; 2000 records in total).

Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

Software availability

• Software code available from: https://zenodo.org/record/8184228

License: The software is available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

References

1. Lin Y-H, Chen Y-C, Tseng Y-C, et al.: Physical activity and successful aging among middle-aged and older adults: a systematic review and meta-analysis of cohort studies. Aging (Albany NY). 2020; 12(9): 7704–7716. PubMed Abstract | Publisher Full Text | Free Full Text
2. Chandraa CE, Abdullaha S: Forecasting mortality trend of Indonesian old aged population with bayesian method. Int. J. Adv. Sci. Eng. Inf. Technol. 2022; 12(2): 580–588. Publisher Full Text
3. Seyda Seydel G, Kucukoglu O, Altinbasv A, et al.: Economic growth leads to increase of obesity and associated hepatocellular carcinoma in developing countries. Ann. Hepatol. 2016; 15(5): 662–672. PubMed Abstract | Publisher Full Text
4. Mbam KC, Halvorsen CJ, Okoye UO: Aging in Nigeria: A Growing Population of Older Adults Requires the Implementation of National Aging Policies. Gerontologist. 2022 Oct 19; 62(9): 1243–1250. PubMed Abstract | Publisher Full Text
5. Wang Q, Li L: The effects of population aging, life expectancy, unemployment rate, population density, per capita GDP, urbanization on per capita carbon emissions. Sustain Product Consum. 2021; 28: 760–774. Publisher Full Text
6. Kiziltan M: The Effects of Population Aging and Life Expectancy on Economic Growth: The Case of Emerging Market Economies. Bayar Y, editor. Handbook of Research on Economic and Social Impacts of Population Aging. IGI Global; 2021; pp. 97–118. ch007. Publisher Full Text
7. Seong MH, Shin E, Sok S: Successful aging perception in middle-aged Korean men: a Q methodology approach. Int. J. Environ. Res. Public Health. 2021; 18(6): 3095. PubMed Abstract | Publisher Full Text | Free Full Text
8. Lin L, Wang HH, Lu C, et al.: Adverse childhood experiences and subsequent chronic diseases among middle-aged or older adults in China and associations with demographic and socioeconomic characteristics. JAMA Netw. Open. 2021; 4(10): e2130143–e2130143. PubMed Abstract | Publisher Full Text | Free Full Text
9. Ferrucci L, Gonzalez-Freire M, Fabbri E, et al.: Measuring biological aging in humans: a quest. Aging Cell. 2020; 19(2): e13080. PubMed Abstract | Publisher Full Text
10. Lin E, Lin C-H, Lane H-Y: Prediction of functional outcomes of schizophrenia with genetic biomarkers using a bagging ensemble machine learning method with feature selection. Sci. Rep. 2021; 11(1): 1–8.
11. Nosraty L, Pulkki J, Raitanen J, et al.: Successful aging as a predictor of long-term care among oldest old: the vitality 90+ study. J. Appl. Gerontol. 2019; 38(4): 553–571. PubMed Abstract | Publisher Full Text
12. Mendoza-Nunez VM, Pulido-Castillo G, Correa-Munoz E, et al.: Effect of a community gerontology program on the control of metabolic syndrome in Mexican older adults. Healthcare. 2022; 10(3): 466. PubMed Abstract | Publisher Full Text | Free Full Text
13. Teater B, Chonody JM: What attributes of successful aging are important to older adults? The development of a multidimensional definition of successful aging. Soc. Work Health Care. 2020; 59(3): 161–179. PubMed Abstract | Publisher Full Text
14. Bowling A: Aspirations for older age in the 21st century: What is successful aging? Int. J. Aging Hum. Dev. 2007; 64(3): 263–297. PubMed Abstract | Publisher Full Text
15. Bosnes I, Nordahl HM, Stordal E, et al.: Lifestyle predictors of successful aging: a 20-year prospective HUNT study. PLoS One. 2019; 14(7): e0219200. PubMed Abstract | Publisher Full Text | Free Full Text
16. Rowe JW, Kahn RL: Successful aging. Gerontologist. 1997; 37(4): 433–440. Publisher Full Text
17. Shafiee M, Hazrati M, Motalebi SA, et al.: Can healthy life style predict successful aging among Iranian older adults? Med. J. Islam Repub. Iran. 2020; 34: 139.
18. Chiao CY, Hsiao CY: Comparison of personality traits and successful aging in older Taiwanese. Geriatr. Gerontol Int. 2017; 17(11): 2239–2246. PubMed Abstract | Publisher Full Text
19. Dorji L, Jullamate P, Subgranon R, et al.: Predicting factors of successful aging among community dwelling older adults in Thimphu, Bhutan. Bangkok Med. J. 2019; 15(1): 38–43. Publisher Full Text
20. Ng TP, Broekman BF, Niti M, et al.: Determinants of successful aging using a multidimensional definition among Chinese elderly in Singapore. Am. J. Geriatr. Psychiatry. 2009; 17(5): 407–416. PubMed Abstract | Publisher Full Text
21. Anton SD, Woods AJ, Ashizawa T, et al.: Successful aging: advancing the science of physical independence in older adults. Ageing Res. Rev. 2015; 24: 304–327. PubMed Abstract | Publisher Full Text | Free Full Text
22. Liu H, Byles JE, Xu X, et al.: Evaluation of successful aging among older people in China: results from China health and retirement longitudinal study. Geriatr. Gerontol. Int. 2017; 17(8): 1183–1190. PubMed Abstract | Publisher Full Text
23. Canêdo AC, Lopes CS, Lourenço RA: Prevalence of and factors associated with successful aging in Brazilian older adults: frailty in Brazilian older people study (FIBRA RJ). Geriatr. Gerontol. Int. 2018; 18(8): 1280–1285. PubMed Abstract | Publisher Full Text
24. Cai T, Long J, Kuang J, et al.: Applying machine learning methods to develop a successful aging maintenance prediction model based on physical fitness tests. Geriatr. Gerontol. Int. 2020; 20(6): 637–642. PubMed Abstract | Publisher Full Text
25. Raza K: Improving the prediction accuracy of heart disease with ensemble learning and majority voting rule. U-Healthcare Monitoring Systems. Elsevier; 2019; pp. 179–196. Publisher Full Text
26. Asghari Varzaneh Z, Shanbehzadeh M, Kazemi-Arpanahi H: Prediction of successful aging using ensemble machine learning algorithms. BMC Med. Inform. Decis. Mak. 2022; 22: 258. PubMed Abstract | Publisher Full Text | Free Full Text
27. Cai T, Long J, Kuang J, et al.: Applying machine learning methods to develop a successful aging maintenance prediction model based on physical fitness tests. Geriatr. Gerontol. Int. 2020; 20(6): 637–642. PubMed Abstract | Publisher Full Text
28. Maryam A, Raoof N, Somayeh N: Designing a Predictive Model for Successful Aging among the Elderly Using Machine Learning Techniques. 2022. Publisher Full Text
29. Nagarajan NR, Teixeira AA, Silva ST: Ageing population: identifying the determinants of ageing in the least developed countries. Popul. Res. Policy Rev. 2021; 40(2): 187–210. Publisher Full Text
30. Olson DL: Data set balancing. In: Chinese Academy of Sciences Symposium on Data Mining and Knowledge Management. Berlin, Heidelberg: Springer; 2004 Jul 12; 71–80.
31. Chandrashekar G, Sahin F: A survey on feature selection methods. Comput. Electr. Eng. 2014; 40(1): 16–28. Publisher Full Text
32. Li J, Cheng K, Wang S, et al.: Feature selection: a data perspective. ACM Comput. Surv. 2017; 50(6): 1–45. Publisher Full Text
33. Guan Z-J, Li R, Jiang J-T, et al.: Data mining and design of electromagnetic properties of Co/FeSi filled coatings based on genetic algorithms optimized artificial neural networks (GA-ANN). Compos. B Eng. 2021; 226: 109383. Publisher Full Text
34. Sinaga LM, Suwilo S: Analysis of classification and Naïve Bayes algorithm k-nearest neighbor in data mining. IOP Conference Series: Materials Science and Engineering. IOP Publishing; 2020; Vol. 725(1): p. 012106.
35. Sembiring M, Tambunan R: Analysis of graduation prediction on time based on student academic performance using the Naïve Bayes Algorithm with data mining implementation (Case study: Department of Industrial Engineering USU). IOP Conference Series: Materials Science and Engineering: 2021. IOP Publishing; 2021; p. 012069.
36. Gopinath C, Manikanta J: Performance Analysis Based on Data Mining Technique in Predicting the Diabetic Disease-Decision tree and Naïve Bayes. 2019 1st International Conference on Advances in Information Technology (ICAIT): 2019. IEEE; 2019; pp. 525–528.
37. Prasetya R, Ridwan A: Data mining application on weather prediction using classification tree, naïve bayes and K-nearest neighbor algorithm with model testing of supervised learning probabilistic brier score, confusion matrix and ROC. J. Appl. Commun. Inf. Technol. 2020; 4(2): 25–33. Publisher Full Text
38. Khazaei S, Najafi-Ghobadi S, Ramezani-Doroh V: Construction data mining methods in the prediction of death in hemodialysis patients using support vector machine, neural network, logistic regression and decision tree. J. Prev. Med. Hyg. 2021; 62(1): E222–E230. PubMed Abstract | Publisher Full Text
39. Chidambaram S, Srinivasagan K: Performance evaluation of support vector machine classification approaches in data mining. Clust. Comput. 2019; 22(1): 189–196. Publisher Full Text
40. Srivastava N, Hinton G, Krizhevsky A, et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014; 15(1): 1929–1958. Publisher Full Text
41. Byeon H: Exploring factors for predicting anxiety disorders of the elderly living alone in South Korea using interpretable machine learning: a population-based study. Int. J. Environ. Res. Public Health. 2021; 18(14): 7625. PubMed Abstract | Publisher Full Text | Free Full Text

Competing interests

No competing interests were disclosed.

Grant information

The author(s) declared that no grants were involved in supporting this work.

Article Versions (2)

version 2

Revised

Published: 03 Apr 2024, 12:1201

https://doi.org/10.12688/f1000research.138608.2

version 1

Published: 25 Sep 2023, 12:1201

https://doi.org/10.12688/f1000research.138608.1

Copyright

© 2023 Zaccheus J et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

how to cite this article

Zaccheus J, Atogwe V, Oyejide A and Salau AO. Towards successful aging classification using machine learning algorithms [version 1; peer review: 2 approved with reservations, 1 not approved]. F1000Research 2023, 12:1201 (https://doi.org/10.12688/f1000research.138608.1)

NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to Reviewer Statuses

Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.

Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper’s academic merit.

Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.

Version 1

Reviewer Report 06 Mar 2024

Peter Fedichev, Gero PTE. LTD, Singapore, Singapore

Approved with Reservations

https://doi.org/10.5256/f1000research.151819.r246031

The manuscript outlines a study conducted on a small cohort from Afe Babalola University Multi-System Hospital’s electronic records in Nigeria (January 2019 - April 2023), detailing a Machine Learning (ML) pipeline to assess SA with novel insights on feature selection and accuracy metrics.

Although technically proficient, the small sample size poses limitations on model complexity and validation robustness. To enhance validation, I recommend cross-validation with external cohorts (e.g., NHANES, UK Biobank), ensuring the model's features are universally applicable.

Additionally, the manuscript employs a linear classifier for SA, based on continuous predictors (log odds ratio). It's crucial to evaluate how this correlates with established aging measures like the frailty index (FI) or biological age (BA). A comparison with log-linear classifiers using FI or BA, if data permits, could enrich the manuscript.

I understand that integrating these comparisons might be demanding. I leave it to the authors to decide whether to add new results or to defer them to future work.

However, I believe that a fair discussion of the sample size, model limitations and the relationship between the SA and FI and BA, all supported by recent literature, is essential. Notable references include Kenneth Rockwood's work on FI (Ref [1]) and recent reviews on BA by the biomarkers of aging consortium (Ref [2]).

Is the work clearly and accurately presented and does it cite the current literature?
Partly

Is the study design appropriate and is the work technically sound?
Yes

Are sufficient details of methods and analysis provided to allow replication by others?
Yes

If applicable, is the statistical analysis and its interpretation appropriate?
Yes

Are all the source data underlying the results available to ensure full reproducibility?
Yes

Are the conclusions drawn adequately supported by the results?
Yes

References

1. Blodgett J, Theou O, Kirkland S, Andreou P, et al.: Frailty in NHANES: Comparing the frailty index and phenotype. Arch Gerontol Geriatr. 2015; 60 (3): 464-70 PubMed Abstract | Publisher Full Text
2. Moqri M, Herzog C, Poganik JR, Ying K, et al.: Validation of biomarkers of aging. Nat Med. 2024; 30 (2): 360-372 PubMed Abstract | Publisher Full Text

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: theories of aging, AI/ML in biology and drug discovery, systems biology, biomarkers of aging, drug discovery against aging and age-related diseases

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report 26 Feb 2024

Jiao Yu, Yale University, Yale, USA

Approved with Reservations

https://doi.org/10.5256/f1000research.151819.r246029

The research investigates successful aging classification utilizing three machine learning algorithms, focusing on predicting successful aging through an analysis of a sample of 2000 individuals from older individuals in Nigeria. A key focus of the study was the assessment of various machine learning models for successful aging prediction. Among these models, the artificial neural network (ANN) demonstrates the best performance. While the study's main findings highlight the potential of machine learning in predicting successful aging, certain limitations require consideration.

Abstract: Include a clear sentence in the background section stating the specific problem your research aims to address.

According to Rowe and Kahn’s (1997) framework, successful aging is frequently operationalized by the absence of major diseases, lack of activity of daily living (ADL) disabilities, high levels of physical and cognitive functioning, and active social engagement. It is unclear how the Rowe and Kahn approach was applied to create the outcome variable for successful aging classification.

It is unclear what the limitations of previous works were. Did the previous studies fail to address the limitations mentioned in this analysis, for example by failing to select significant aging-related features?

Concerns also arise regarding the justification of the sample size. The authors need to provide a rationale for why the chosen sample size is deemed sufficient for the study's objectives.

It is unclear as to how the feature selection was conducted. It is unclear whether the machine learning models utilized in this study also served as feature selection algorithms or if an alternative approach was adopted.

I am also concerned that the high accuracy of ANN may result from overfitting, particularly in the absence of independent validation data. This may also be due to the fact that predictors are highly correlated with your outcome. You may consider a correlation analysis between predictor variables and the outcome variable. If high correlations exist, discuss their impact on model estimation.

While the ANN method shows the best performance in this dataset, its generalizability remains uncertain. How these methods can be applied in practice needs further development in the discussion.

Is the work clearly and accurately presented and does it cite the current literature?
No

Is the study design appropriate and is the work technically sound?
Yes

Are sufficient details of methods and analysis provided to allow replication by others?
No

If applicable, is the statistical analysis and its interpretation appropriate?
Yes

Are all the source data underlying the results available to ensure full reproducibility?
Partly

Are the conclusions drawn adequately supported by the results?
Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: aging, health disparities

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report 19 Feb 2024

Brian H Chen, UC San Diego, San Diego, California, USA

Not Approved

https://doi.org/10.5256/f1000research.151819.r235047

Zaccheus et al. present an analysis comparing 3 machine learning approaches for the classification of "successful aging" (SA) from electronic medical records from 2,000 patients from a single hospital.

The paper needs further work on its premise, since the authors seem to be using input variables that are part of the definition of the outcome. Furthermore, insufficient details are provided on the methodology.

Here are my suggestions:

1) The use of Rowe and Kahn definition of "Successful Aging" is not widely accepted. That said, it was not clear how the principles posed by Rowe and Kahn were applied to create the outcome variable.

2) The authors should make clearer why they selected the input variables for SA classification. I imagine the presence of some of these input variables (e.g., diseases, ADLs, QOL) are part of the definition of "SA." Are the authors merely trying to recreate Rowe & Kahn's model with the addition of other variables?

3) The arbitrary selection of machine learning algorithms needs further justification. I would predict a priori that the SVM and Naive Bayes approaches would perform worse than any neural network.

4) The authors were unclear as to how the feature selection was actually conducted. We will need more detail than those variables being "most pertinent to SA."

5) The authors did not validate their models using an independent sample, which explains their overly optimistic model performance.

6) The confusion matrix in Figure 4 is incorrect.

Is the work clearly and accurately presented and does it cite the current literature?
No

Is the study design appropriate and is the work technically sound?
No

Are sufficient details of methods and analysis provided to allow replication by others?
No

If applicable, is the statistical analysis and its interpretation appropriate?
No

Are all the source data underlying the results available to ensure full reproducibility?
Yes

Are the conclusions drawn adequately supported by the results?
No

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Aging biomarkers and prediction modeling using machine learning.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.

Reviewer Reports

	Invited Reviewers
	1	2	3	4	5
Version 2 (revision) 03 Apr 24				read	read
Version 1 25 Sep 23	read	read	read

1. Brian H Chen, UC San Diego, San Diego, USA
2. Jiao Yu, Yale University, Yale, USA
3. Peter Fedichev, Gero PTE. LTD, Singapore, Singapore
4. Tagne Poupi Theodore Armand, Inje University, Gimhae, South Korea
5. Larissa Pruner Marques, Oswaldo Cruz Foundation, Rio de Janeiro, Rio de Janeiro, Brazil

Reviewer Report

06 Nov 2024 | for Version 2

Larissa Pruner Marques, Oswaldo Cruz Foundation, Rio de Janeiro, Rio de Janeiro, Brazil

Approved With Reservations

This study investigates the application of machine learning (ML) techniques to classify individuals into successful aging (SA) and non-successful aging categories, emphasizing the importance of a positive perspective on aging. By utilizing geriatric data from a hospital in Nigeria, the researchers employed three ML methods, including artificial neural networks (ANN), support vector machines, and Naive Bayes, to analyze a sample of 2,000 individuals.
Here are a few topics for improvement:

Theoretical Definition of Successful Aging: The authors define successful aging (SA) in the abstract as being free from diseases, relying on a theoretical framework that may overly constrain the concept. This perspective does not adequately reflect the adaptability and multifactorial nature of aging, which is more comprehensively discussed in the introduction. By presenting such a limiting definition upfront, the article may mislead readers regarding the broader understanding of successful aging.
Potential Ageist Terminology: The use of terms like "geriatric data" and "elderly" throughout the article could inadvertently promote ageism. It is important for the authors to revisit and reconsider these terms, opting for language that is more respectful and inclusive of older adults, thereby fostering a more positive discourse around aging and medical terms.
Citing References: In the sixth paragraph of the introduction, the authors reference previous works using the term "Ref. 26" instead of the authors' surnames. This could disrupt the flow of reading and diminish the credibility of the sources. The authors should ensure proper citation by using the authors' names for clarity and engagement.
Introduction Structure: The final paragraph of the introduction begins with the study's objective but then shifts to discussing methodology. While providing justification for the research is important, this paragraph would benefit from a more cohesive structure that concludes with the research objective rather than detailing the methodology. This adjustment would enhance clarity and maintain focus.
Clarity on Variables for Successful Aging: A significant limitation of the article is the lack of a clear framework for defining and measuring the variables associated with successful aging. The authors should specify how they assess nutritional status, daily living activities, and quality of life, including the cutoff points for classification. This clarity is essential for understanding the model's reproducibility, as emphasized by the authors in their conclusion.
Table Placement: The inclusion of a table in the discussion section is unconventional and may confuse readers. The authors might consider repositioning the table as supplementary material to maintain the integrity of the discussion.
Data Availability: I commend the authors for their commitment to transparency and openness by making the data available on Zenodo. This is a commendable practice that supports the reproducibility and integrity of research findings.

By addressing these critiques, the authors can enhance the clarity, inclusivity, and overall quality of their article.

Is the work clearly and accurately presented and does it cite the current literature?

Partly
Is the study design appropriate and is the work technically sound?

Yes
Are sufficient details of methods and analysis provided to allow replication by others?

Partly
If applicable, is the statistical analysis and its interpretation appropriate?

I cannot comment. A qualified statistician is required.
Are all the source data underlying the results available to ensure full reproducibility?

Partly
Are the conclusions drawn adequately supported by the results?

Yes

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

public health, multimorbidity, quality of life, aging

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report

18 Oct 2024 | for Version 2

Tagne Poupi Theodore Armand, Inje University, Gimhae, South Korea

Approved With Reservations

The research paper presented by Zaccheus et al. uses three state-of-the-art machine learning models to classify successful aging (SA). The authors used data from a sample of 2,000 individuals from Afe Babalola University Multi-System Hospital. The reported results are outstanding, with 100% accuracy, 100% sensitivity, and 100% precision, but they seem unrealistic. Although the method employed is not new, the application to SA can be significant for science, with positive outcomes for patients, considering the size of the aging population worldwide.
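For context on the metrics quoted above, the following is a minimal sketch of how accuracy, sensitivity, and precision follow from confusion-matrix counts. The function name and all counts are hypothetical and are not taken from the manuscript; the point is only that a perfect score on all three requires zero false positives and zero false negatives.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, sensitivity (recall), and precision from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators so the function never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity, "precision": precision}


# Hypothetical counts for a 600-sample test set (assumed, not from the paper).
perfect = classification_metrics(tp=300, fp=0, fn=0, tn=300)
imperfect = classification_metrics(tp=290, fp=5, fn=10, tn=295)

print(perfect)
print(imperfect)
```

Even a handful of misclassified cases, as in the second call, moves every metric below 100%.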

Here are some points that can improve the manuscript.

1. In the data collection process, the authors mentioned the non-exclusion of missing values and said they used the mean imputation method. Dealing with medical data requires more explanation, and sometimes expert intervention, for missing data handling; this part must be revised. It remains unclear whether the imputed mean values biased the dataset.

2. It is unclear why the authors used two data split configurations, 70/30 and 50/50, for training. They further indicated that 20% was used for validation, yet in the experimental details they mention 30% for testing. The split of the data into training, testing, and validation sets remains confusing.

3. The feature selection process is not sufficiently described. The authors mentioned consulting gerontologists but said they selected some of the most relevant features. Did you apply any additional techniques to determine feature importance? If yes, describe them.

4. Though an external validation set may be difficult to obtain, it can further confirm the results.

5. There is no clear justification for the choice of machine learning algorithms.

6. Is there any clear conclusion about using two proportions of datasets?
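Points 1 and 2 can be illustrated with a minimal sketch, using only the Python standard library, of a disjoint train/validation/test split in which the imputation means are computed on the training partition alone and then reused for the other two. The 60/20/20 proportions, the function names, and the synthetic data are assumptions made for illustration; they do not reproduce the authors' pipeline.

```python
import random


def three_way_split(rows, train=0.6, val=0.2, seed=0):
    """Shuffle rows and split them into disjoint train/validation/test partitions."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def column_means(rows):
    """Mean of each column, ignoring missing values (encoded as None)."""
    means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means


def impute(rows, means):
    """Replace each missing value with the supplied mean for its column."""
    return [[m if v is None else v for v, m in zip(row, means)] for row in rows]


# Synthetic data (assumed): 100 records, 3 features, roughly 10% missing.
rng = random.Random(42)
data = [[rng.gauss(50, 10) if rng.random() > 0.1 else None for _ in range(3)]
        for _ in range(100)]

train_rows, val_rows, test_rows = three_way_split(data)

# Means come from the training partition only, so no information
# from the validation or test records leaks into preprocessing.
train_means = column_means(train_rows)
train_X = impute(train_rows, train_means)
val_X = impute(val_rows, train_means)
test_X = impute(test_rows, train_means)
```

Stating the three proportions explicitly in this way, and reporting which partition each metric was computed on, would remove the ambiguity raised in point 2.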

Is the work clearly and accurately presented and does it cite the current literature?

Partly
Is the study design appropriate and is the work technically sound?

Partly
Are sufficient details of methods and analysis provided to allow replication by others?

Yes
If applicable, is the statistical analysis and its interpretation appropriate?

No
Are all the source data underlying the results available to ensure full reproducibility?

Partly
Are the conclusions drawn adequately supported by the results?

Yes

Competing Interests

No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report

06 Mar 2024 | for Version 1

Peter Fedichev, Gero PTE. LTD, Singapore, Singapore

Approved With Reservations

The manuscript outlines a study conducted on a small cohort from Afe Babalola University Multi-System Hospital’s electronic records in Nigeria (January 2019 - April 2023), detailing a Machine Learning (ML) pipeline to assess SA with novel insights on feature selection and accuracy metrics.

Although technically proficient, the small sample size poses limitations on model complexity and validation robustness. To enhance validation, I recommend cross-validation with external cohorts (e.g., NHANES, UK Biobank), ensuring the model's features are universally applicable.

Additionally, the manuscript employs a linear classifier for SA, based on continuous predictors (log odds ratio). It's crucial to evaluate how this correlates with established aging measures like the frailty index (FI) or biological age (BA). A comparison with log-linear classifiers using FI or BA, if data permits, could enrich the manuscript.

I understand that integrating these comparisons might be demanding. I leave it to the authors to decide whether to add new results or defer them to future work.

However, I believe that a fair discussion of the sample size, model limitations and the relationship between the SA and FI and BA, all supported by recent literature, is essential. Notable references include Kenneth Rockwood's work on FI (Ref [1]) and recent reviews on BA by the biomarkers of aging consortium (Ref [2]).

Is the work clearly and accurately presented and does it cite the current literature?

Partly
Is the study design appropriate and is the work technically sound?

Yes
Are sufficient details of methods and analysis provided to allow replication by others?

Yes
If applicable, is the statistical analysis and its interpretation appropriate?

Yes
Are all the source data underlying the results available to ensure full reproducibility?

Yes
Are the conclusions drawn adequately supported by the results?

Yes

References

1. Blodgett J, Theou O, Kirkland S, Andreou P, et al.: Frailty in NHANES: Comparing the frailty index and phenotype. Arch Gerontol Geriatr. 2015; 60 (3): 464-70
2. Moqri M, Herzog C, Poganik JR, Ying K, et al.: Validation of biomarkers of aging. Nat Med. 2024; 30 (2): 360-372

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

theories of aging, AI/ML in biology and drug discovery, systems biology, biomarkers of aging, drug discovery against aging and age-related diseases

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report

26 Feb 2024 | for Version 1

Jiao Yu, Yale University, Yale, USA

Approved With Reservations

The research investigates successful aging classification using three machine learning algorithms, focusing on predicting successful aging through an analysis of a sample of 2,000 older individuals in Nigeria. A key focus of the study is the assessment of various machine learning models for successful aging prediction. Among these models, the artificial neural network (ANN) demonstrates the best performance. While the study's main findings highlight the potential of machine learning in predicting successful aging, certain limitations require consideration.

Abstract: Include a clear sentence in the background section stating the specific problem your research aims to address.

According to Rowe and Kahn's (1997) framework, successful aging is frequently operationalized by the absence of major diseases, the absence of activity of daily living (ADL) disabilities, high levels of physical and cognitive functioning, and active social engagement. It is unclear how the Rowe and Kahn approach was applied to create the outcome variable for successful aging classification.

It is unclear what the limitations of previous works were. Did the previous studies fail to address the limitations mentioned in this analysis, for example by failing to select significant aging-related features?

Concerns also arise regarding the justification of the sample size. The authors need to provide a rationale for why the chosen sample size is deemed sufficient for the study's objectives.

It is unclear as to how the feature selection was conducted. It is unclear whether the machine learning models utilized in this study also served as feature selection algorithms or if an alternative approach was adopted.

I am also concerned that the high accuracy of ANN may result from overfitting, particularly in the absence of independent validation data. This may also be due to the fact that predictors are highly correlated with your outcome. You may consider a correlation analysis between predictor variables and the outcome variable. If high correlations exist, discuss their impact on model estimation.
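The suggested correlation check could be sketched as follows: compute the Pearson correlation between each predictor and the binary outcome (equivalent to the point-biserial correlation) and flag predictors that nearly determine the label. The feature names, the toy values, and the 0.9 threshold are all assumptions chosen for illustration, not values from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column carries no linear association
    return cov / (sd_x * sd_y)


# Toy data (assumed): 1 = successful aging, 0 = not.
outcome = [1, 1, 1, 0, 0, 0, 1, 0]
features = {
    "adl_score": [9, 8, 9, 3, 2, 4, 8, 3],        # hypothetical predictor that tracks the label
    "age": [66, 71, 80, 69, 75, 68, 77, 72],      # hypothetical weakly related predictor
}

correlations = {name: pearson(values, outcome) for name, values in features.items()}

# Assumed threshold: predictors above it may effectively encode the outcome.
flagged = [name for name, r in correlations.items() if abs(r) > 0.9]

for name, r in correlations.items():
    print(f"{name}: r = {r:.3f}")
print("Flagged:", flagged)
```

A predictor flagged in this way would suggest that near-perfect accuracy reflects overlap between inputs and the outcome definition rather than genuine predictive skill.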

While the ANN method shows the best performance in this dataset, its generalizability remains uncertain. How these methods can be applied in practice needs further development in the discussion.

Is the work clearly and accurately presented and does it cite the current literature?

No
Is the study design appropriate and is the work technically sound?

Yes
Are sufficient details of methods and analysis provided to allow replication by others?

No
If applicable, is the statistical analysis and its interpretation appropriate?

Yes
Are all the source data underlying the results available to ensure full reproducibility?

Partly
Are the conclusions drawn adequately supported by the results?

Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

aging, health disparities

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report

19 Feb 2024 | for Version 1

Brian H Chen, UC San Diego, San Diego, California, USA

Not Approved

Zaccheus et al. present an analysis comparing 3 machine learning approaches for the classification of "successful aging" (SA) from electronic medical records from 2,000 patients from a single hospital.

The paper needs further work on its premise, since the authors seem to be using input variables that are part of the definition of the outcome. Furthermore, insufficient details are provided on the methodology.

Here are my suggestions:

1) The use of Rowe and Kahn definition of "Successful Aging" is not widely accepted. That said, it was not clear how the principles posed by Rowe and Kahn were applied to create the outcome variable.

2) The authors should make clearer why they selected the input variables for SA classification. I imagine the presence of some of these input variables (e.g., diseases, ADLs, QOL) are part of the definition of "SA." Are the authors merely trying to recreate Rowe & Kahn's model with the addition of other variables?

3) The arbitrary selection of machine learning algorithms needs further justification. I would predict a priori that the SVM and Naive Bayes approaches would perform worse than any neural network.

4) The authors were unclear as to how the feature selection was actually conducted. We will need more detail than those variables being "most pertinent to SA."

5) The authors did not validate their models using an independent sample, which explains their overly optimistic model performance.

6) The confusion matrix in Figure 4 is incorrect.

Is the work clearly and accurately presented and does it cite the current literature?

No
Is the study design appropriate and is the work technically sound?

No
Are sufficient details of methods and analysis provided to allow replication by others?

No
If applicable, is the statistical analysis and its interpretation appropriate?

No
Are all the source data underlying the results available to ensure full reproducibility?

Yes
Are the conclusions drawn adequately supported by the results?

No

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Aging biomarkers and prediction modeling using machine learning.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.

[1] Lin Y-H, Chen Y-C, Tseng Y-C, et al.: Physical activity and successful aging among middle-aged and older adults: a systematic review and meta-analysis of cohort studies. Aging (Albany NY). 2020; 12(9): 7704–7716.

[2] Chandra CE, Abdullah S: Forecasting mortality trend of Indonesian old aged population with Bayesian method. Int. J. Adv. Sci. Eng. Inf. Technol. 2022; 12(2): 580–588.

[3] Seyda Seydel G, Kucukoglu O, Altinbasv A, et al.: Economic growth leads to increase of obesity and associated hepatocellular carcinoma in developing countries. Ann. Hepatol. 2016; 15(5): 662–672.

[4] Mbam KC, Halvorsen CJ, Okoye UO: Aging in Nigeria: A Growing Population of Older Adults Requires the Implementation of National Aging Policies. Gerontologist. 2022 Oct 19; 62(9): 1243–1250.

[5] Wang Q, Li L: The effects of population aging, life expectancy, unemployment rate, population density, per capita GDP, urbanization on per capita carbon emissions. Sustain Product Consum. 2021; 28: 760–774.

[6] Kiziltan M: The Effects of Population Aging and Life Expectancy on Economic Growth: The Case of Emerging Market Economies. Bayar Y, editor. Handbook of Research on Economic and Social Impacts of Population Aging. IGI Global; 2021; pp. 97–118. ch007.

[7] Seong MH, Shin E, Sok S: Successful aging perception in middle-aged Korean men: a Q methodology approach. Int. J. Environ. Res. Public Health. 2021; 18(6): 3095.

[8] Lin L, Wang HH, Lu C, et al.: Adverse childhood experiences and subsequent chronic diseases among middle-aged or older adults in China and associations with demographic and socioeconomic characteristics. JAMA Netw. Open. 2021; 4(10): e2130143–e2130143.

[9] Ferrucci L, Gonzalez-Freire M, Fabbri E, et al.: Measuring biological aging in humans: a quest. Aging Cell. 2020; 19(2): e13080.

[10] Lin E, Lin C-H, Lane H-Y: Prediction of functional outcomes of schizophrenia with genetic biomarkers using a bagging ensemble machine learning method with feature selection. Sci. Rep. 2021; 11(1): 1–8.

[11] Nosraty L, Pulkki J, Raitanen J, et al.: Successful aging as a predictor of long-term care among oldest old: the vitality 90+ study. J. Appl. Gerontol. 2019; 38(4): 553–571.

[12] Mendoza-Nunez VM, Pulido-Castillo G, Correa-Munoz E, et al.: Effect of a community gerontology program on the control of metabolic syndrome in Mexican older adults. Healthcare. 2022; 10(3): 466.

[13] Teater B, Chonody JM: What attributes of successful aging are important to older adults? The development of a multidimensional definition of successful aging. Soc. Work Health Care. 2020; 59(3): 161–179.

[14] Bowling A: Aspirations for older age in the 21st century: What is successful aging? Int. J. Aging Hum. Dev. 2007; 64(3): 263–297.

[15] Bosnes I, Nordahl HM, Stordal E, et al.: Lifestyle predictors of successful aging: a 20-year prospective HUNT study. PLoS One. 2019; 14(7): e0219200.

[16] Rowe JW, Kahn RL: Successful aging. Gerontologist. 1997; 37(4): 433–440.

[17] Shafiee M, Hazrati M, Motalebi SA, et al.: Can healthy life style predict successful aging among Iranian older adults? Med. J. Islam Repub. Iran. 2020; 34: 139.

[18] Chiao CY, Hsiao CY: Comparison of personality traits and successful aging in older Taiwanese. Geriatr. Gerontol. Int. 2017; 17(11): 2239–2246.

[19] Dorji L, Jullamate P, Subgranon R, et al.: Predicting factors of successful aging among community dwelling older adults in Thimphu, Bhutan. Bangkok Med. J. 2019; 15(1): 38–43.

[20] Ng TP, Broekman BF, Niti M, et al.: Determinants of successful aging using a multidimensional definition among Chinese elderly in Singapore. Am. J. Geriatr. Psychiatry. 2009; 17(5): 407–416.

[21] Anton SD, Woods AJ, Ashizawa T, et al.: Successful aging: advancing the science of physical independence in older adults. Ageing Res. Rev. 2015; 24: 304–327.

[22] Liu H, Byles JE, Xu X, et al.: Evaluation of successful aging among older people in China: results from China health and retirement longitudinal study. Geriatr. Gerontol. Int. 2017; 17(8): 1183–1190.

[23] Canêdo AC, Lopes CS, Lourenço RA: Prevalence of and factors associated with successful aging in Brazilian older adults: frailty in Brazilian older people study (FIBRA RJ). Geriatr. Gerontol. Int. 2018; 18(8): 1280–1285.

[24] Cai T, Long J, Kuang J, et al.: Applying machine learning methods to develop a successful aging maintenance prediction model based on physical fitness tests. Geriatr. Gerontol. Int. 2020; 20(6): 637–642.

[25] Raza K: Improving the prediction accuracy of heart disease with ensemble learning and majority voting rule. U-Healthcare Monitoring Systems. Elsevier; 2019; pp. 179–196.

[26] Asghari Varzaneh Z, Shanbehzadeh M, Kazemi-Arpanahi H: Prediction of successful aging using ensemble machine learning algorithms. BMC Med. Inform. Decis. Mak. 2022; 22: 258.

[27] Cai T, Long J, Kuang J, et al.: Applying machine learning methods to develop a successful aging maintenance prediction model based on physical fitness tests. Geriatr. Gerontol. Int. 2020; 20(6): 637–642.

[28] Maryam A, Raoof N, Somayeh N: Designing a Predictive Model for Successful Aging among the Elderly Using Machine Learning Techniques. 2022.

[29] Nagarajan NR, Teixeira AA, Silva ST: Ageing population: identifying the determinants of ageing in the least developed countries. Popul. Res. Policy Rev. 2021; 40(2): 187–210.

[30] Olson DL: Data set balancing. In: Chinese Academy of Sciences Symposium on Data Mining and Knowledge Management. Berlin, Heidelberg: Springer; 2004 Jul 12; 71–80.

[31] Chandrashekar G, Sahin F: A survey on feature selection methods. Comput. Electr. Eng. 2014; 40(1): 16–28.

[32] Li J, Cheng K, Wang S, et al.: Feature selection: a data perspective. ACM Comput. Surv. 2017; 50(6): 1–45.

[33] Guan Z-J, Li R, Jiang J-T, et al.: Data mining and design of electromagnetic properties of Co/FeSi filled coatings based on genetic algorithms optimized artificial neural networks (GA-ANN). Compos. B Eng. 2021; 226: 109383.

[34] Sinaga LM, Suwilo S: Analysis of classification and Naïve Bayes algorithm k-nearest neighbor in data mining. IOP Conference Series: Materials Science and Engineering. IOP Publishing; 2020; Vol. 725(1): p. 012106.

[35] Sembiring M, Tambunan R: Analysis of graduation prediction on time based on student academic performance using the Naïve Bayes Algorithm with data mining implementation (Case study: Department of Industrial Engineering USU). IOP Conference Series: Materials Science and Engineering: 2021. IOP Publishing; 2021; p. 012069.

[36] Gopinath C, Manikanta J: Performance Analysis Based on Data Mining Technique in Predicting the Diabetic Disease-Decision tree and Naïve Bayes. 2019 1st International Conference on Advances in Information Technology (ICAIT): 2019. IEEE; 2019; pp. 525–528.

[37] Prasetya R, Ridwan A: Data mining application on weather prediction using classification tree, naïve bayes and K-nearest neighbor algorithm with model testing of supervised learning probabilistic brier score, confusion matrix and ROC. J. Appl. Commun. Inf. Technol. 2020; 4(2): 25–33.

[38] Khazaei S, Najafi-Ghobadi S, Ramezani-Doroh V: Construction data mining methods in the prediction of death in hemodialysis patients using support vector machine, neural network, logistic regression and decision tree. J. Prev. Med. Hyg. 2021; 62(1): E222–E230.

[39] Chidambaram S, Srinivasagan K: Performance evaluation of support vector machine classification approaches in data mining. Clust. Comput. 2019; 22(1): 189–196.

[40] Srivastava N, Hinton G, Krizhevsky A, et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014; 15(1): 1929–1958.

[41] Byeon H: Exploring factors for predicting anxiety disorders of the elderly living alone in South Korea using interpretable machine learning: a population-based study. Int. J. Environ. Res. Public Health. 2021; 18(14): 7625.
