Towards successful aging classification using machine learning algorithms

Jesuloluwa Zaccheus; Victoria Atogwe; Ayodele Oyejide; Ayodeji Olalekan Salau

doi:10.12688/f1000research.138608.2

Research Article

Revised

[version 2; peer review: 4 approved with reservations, 1 not approved]

Jesuloluwa Zaccheus¹, Victoria Atogwe¹, Ayodele Oyejide¹, Ayodeji Olalekan Salau²,³

PUBLISHED 03 Apr 2024

¹ Biomedical Engineering, Afe Babalola University, Ado Ekiti, Ekiti, 23401, Nigeria
² Department of Electrical/Electronics and Computer Engineering, Afe Babalola University, Ado Ekiti, Ekiti, Nigeria
³ Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Tamil Nadu, Chennai, India

Jesuloluwa Zaccheus
Roles: Conceptualization, Data Curation, Formal Analysis, Methodology, Software, Visualization, Writing – Original Draft Preparation

Victoria Atogwe
Roles: Data Curation, Investigation, Validation, Writing – Original Draft Preparation

Ayodele Oyejide
Roles: Formal Analysis, Methodology, Software, Validation

Ayodeji Olalekan Salau
Roles: Conceptualization, Data Curation, Investigation, Methodology, Supervision, Validation, Visualization, Writing – Review & Editing

This article is included in the Artificial Intelligence and Machine Learning gateway.

Abstract

Background: Aging is a significant risk factor for the majority of chronic diseases and impairments. The rising medical costs associated with the world's growing older population increase the strain on families and communities. Successful aging (SA) offers a positive, quality-focused perspective on aging. From a biological perspective, it refers to the state of being free from diseases or impairments that hinder normal functioning. This differs from typical aging, which is associated with a gradual decline in both physical and cognitive capacities as individuals grow older.

Methods: In this study, geriatric data acquired from the Afe Babalola University Multi-System Hospital, Ado-Ekiti, were first pre-processed, and three fundamental machine learning (ML) techniques, namely artificial neural networks, support vector machines, and Naive Bayes, were then built using data from a sample of 2000 individuals. The Rowe and Kahn model was used to label each record as SA or non-SA based on factors such as having no more than two diseases, quality of life, nutrition, and capacity for everyday activities.

Results: According to the experimental findings, the Artificial Neural Network (ANN) performed better than the other models in predicting SA, with 100% accuracy, 100% sensitivity, and 100% precision.

Conclusions: The results show that ML techniques are useful in assisting social and health policymakers in their decisions on SA. The presented ANN-based method surpasses the other ML models when it comes to classifying people into SA and non-SA categories.

Keywords

Quality of Life, Aging, Machine Learning, ANN, Population

Corresponding author: Ayodeji Olalekan Salau

Competing interests: No competing interests were disclosed.

Grant information: The author(s) declared that no grants were involved in supporting this work.

Copyright: © 2024 Zaccheus J et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Zaccheus J, Atogwe V, Oyejide A and Salau AO. Towards successful aging classification using machine learning algorithms [version 2; peer review: 4 approved with reservations, 1 not approved]. F1000Research 2024, 12:1201 (https://doi.org/10.12688/f1000research.138608.2) First published: 25 Sep 2023, 12:1201 (https://doi.org/10.12688/f1000research.138608.1) Latest published: 03 Apr 2024, 12:1201 (https://doi.org/10.12688/f1000research.138608.2)

Revised Amendments from Version 1

The manuscript underwent several revisions based on the feedback from the reviewers. Figure 4 was removed to avoid conflicting with the results. A sentence was added to the abstract's background section to clarify the research's aim. The manuscript was reviewed for grammar errors and revised to address the reviewers' concerns regarding feature selection and model generalization. The sections now have clear sentences that explain the process. The machine learning models used and the validation approach are now justified. Furthermore, the data preprocessing and feature selection sections now contain a detailed explanation of how Rowe and Kahn's model was applied to create the outcome variable.

Introduction

The World Health Organization (WHO) predicts that by 2050, almost 1.6 billion people will be older adults, accounting for around 16% of the world’s total population.¹ Industrialized nations are currently experiencing significant sociodemographic and lifestyle changes, resulting in a shift from aging to a super-ageing population.²,³ Nigeria, which has the largest economy in Africa, has the 19th-highest percentage of elderly citizens globally, and this percentage is expected to almost triple over the next two decades.⁴ However, the rise of older Nigerians is taking place amidst poverty, unresolved development issues, socioeconomic disparity, and a loss of traditional support and care for senior citizens.

Human life expectancy has increased globally due to advancements in medicine and social science, leading to the aging of populations becoming a problem for all nations.⁵,⁶ While the extension of life expectancy is a significant scientific achievement, it has also resulted in increased expenses for social welfare and care for the elderly.⁷ This demographic shift has not only influenced disease patterns and increased chronic illnesses worldwide but has also posed socioeconomic challenges for governments and families.⁸–¹⁰ In light of these developments, it is essential to consider the quality of life of the elderly and their preferences during this stage of life. Although everyone desires to live longer, it is more important for both individuals and society to focus on enhancing quality of life and reducing the burden of diseases in old age.¹¹

To address the challenges associated with population aging, the concept of successful aging (SA) has emerged. SA recognizes that the aging process is unique to each individual and encompasses various aspects across disciplines.¹²,¹³ Although there is no formal definition for SA, it is widely accepted that it involves being free from chronic illnesses and having healthy physical and mental functioning.¹⁴,¹⁵ Rowe and Kahn proposed an operational hypothesis for SA, which consists of three components: active engagement in life, absence of illness or impairment, and optimal physical and cognitive functioning.¹⁶,¹⁷ This hypothesis is widely recognized in academic circles and emphasizes how elderly individuals adapt to the physical, spiritual, and social changes brought about by aging.¹⁸,¹⁹

The concept of SA has evolved from a single-dimensional focus on the presence of disease or functional decline to a multidimensional perspective aligned with the World Health Organization’s definition of health, encompassing physical, mental, social, and spiritual well-being.²⁰ However, defining this complex and multidimensional phenomenon has proven challenging due to inherent ambiguity.²¹

It is noteworthy that non-genetic factors, in addition to genetic influences, have a considerable impact on aging.¹⁵ Prior research has usually concentrated on the factors that affect SA, but few long-term studies of SA have been conducted.²²,²³ Because the factors influencing SA are complex and interdependent, traditional statistical models are poorly suited to the problem.²⁴ Machine learning (ML) methods have become increasingly important in recent years for handling challenging, multidimensional, and nonlinear problems.²⁵ As a result, it is possible to develop an intelligent model that predicts whether or not an individual will achieve SA.

The use of machine learning to predict and identify social aspects of aging has been studied in some detail. For instance, the authors in Ref. 26 used a sample of 983 to train five fundamental ML models (ANN, DT, SVM, NB, and K-NN) and one ensemble technique. The ensemble prediction was obtained by majority voting, which relies on the collective decision of the base models. The authors attained 93% accuracy, 92% specificity, and 87% sensitivity. The authors in Ref. 27 used questionnaires and fitness tests to gather the necessary information from the elderly population. The models used were gradient boosting decision trees, random forests, deep learning, and logistic regression. On 890 samples, the deep learning model outperformed the other models, achieving an accuracy of 89.3%, a positive predictive value of 85.8%, and a specificity of 93.1%. They concluded that the deep learning model is well suited to predicting SA maintenance. The use of machine learning approaches to predict successful aging in the elderly was discussed in Ref. 28. For the analysis, the researchers examined the SA and non-SA data of 975 elderly persons. The Chi-square test at P < 0.05 was used to determine the factors that had the greatest impact on SA. Several algorithms, including Adaptive Boost, Random Forest, Artificial Neural Network, Support Vector Machine, and Naive Bayes, were employed to develop prediction models, and their performance was evaluated using various metrics. The sensitivity, which measures the ability to correctly identify positive cases, was 91%. The specificity, which measures the ability to correctly identify negative cases, was 98%. The overall accuracy was 95%, and the F-measure, which assesses the model's overall performance, was 90%. The area under the curve (AUC), which measures the model's ability to distinguish between positive and negative cases, was 88.4%. Based on these evaluations, it was concluded that the Random Forest algorithm exhibited the best performance in predicting SA in elderly individuals.

This study aims to develop accurate models for predicting SA using data from the medical records of geriatric patients. Previous studies had limitations such as insufficient training data, class imbalance in the dataset, ineffective feature selection techniques, and poor accuracy. This study addresses these limitations by analyzing a sufficiently large dataset and consulting with gerontologists, and by applying feature selection and dropout techniques to improve the predictive performance of the models. The accuracy of the models was evaluated using artificial neural networks (ANN). Three machine learning models were developed for SA prediction; they identified factors that are crucial for early SA prediction, such as sociodemographic, clinical, and lifestyle characteristics. The models were also used to investigate key determinants of SA progression. The subsequent sections describe the development process, results, and conclusion of the proposed SA prediction system.

Methodology

System architecture

Figure 1 depicts the main steps of the proposed system architecture: data pre-processing, feature selection, model construction, performance measurement, and classification of successful aging. In the data pre-processing stage, variables are extracted from the electronic medical records of elderly patients; this original dataset is unstructured and cannot be used directly to build the models. Second, feature selection was used to separate redundant features from those that support accurate prediction and to extract the significant variables relevant to successful aging. Third, three prediction models were developed for this study: Support Vector Machine (SVM), Naive Bayes (NB), and Artificial Neural Network (ANN). During development, hyperparameters for the SVM and NB models were carefully selected. The pre-processed data were then fed into each model to train the classifier to identify whether an individual has SA or non-SA status. The ANN model, combined with a dropout strategy, provides a better learning pattern for predicting successful aging and offers increased classification accuracy, specificity, and sensitivity when compared to the other models. During both training and testing, the network uses ReLU activation functions in its hidden layers and a sigmoid activation function in its output layer. Finally, the best hyperparameters and training ratios are used to increase the models' predictive accuracy.

Figure 1. Proposed system architecture.

Study parameters

This study is a retrospective analysis of the sociodemographic, clinical, behavioural, and psychological parameters of patients aged 60 years or older, drawn from their medical records.²⁹ Multiple variables were identified from the preliminary data gathered. A machine learning model was used to analyze the data, and the most useful features were extracted and used as input parameters. In addition, the factors associated with successful aging were determined through consultation with gerontology experts and a review of the relevant literature.

The sociodemographic factors considered were age, sex, reading proficiency, marital status, occupation, and income level. The clinical factors included hypertension, heart disease, kidney, liver, bone, and muscular disorders, depression, eye and eyelid disorders, diabetes, and cancer. The behavioural and psychosocial factors included the ability to carry out activities of daily living (ADLs), life satisfaction, quality of life (QOL), a healthy lifestyle, interpersonal relationships, nutrition, physical activity, illness prevention strategies, and stress management. The sociodemographic and clinical information was taken from the medical records of elderly individuals.

The outcome variable was divided into two classes: SA (coded 1) and non-SA (coded 0). SA was quantified using Rowe and Kahn's approach,¹⁶ under which SA consists of maintaining good mental and physical function, ongoing involvement in life, and the absence of disease and disease-related disability.
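The outcome coding can be sketched as a simple rule. The following is a minimal illustration and not the authors' implementation: the field names (`chronic_disease_count`, `good_qol`, `adequate_nutrition`, `independent_adl`), their binary encoding, and the requirement that all criteria hold at once are assumptions made here; only the two-disease threshold and the 1/0 coding are taken from the article.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical field names; the binary encodings are assumptions.
    chronic_disease_count: int
    good_qol: bool
    adequate_nutrition: bool
    independent_adl: bool


def label_sa(record: Record, max_diseases: int = 2) -> int:
    """Return 1 (SA) if every criterion is met, otherwise 0 (non-SA)."""
    meets_all = (
        record.chronic_disease_count <= max_diseases
        and record.good_qol
        and record.adequate_nutrition
        and record.independent_adl
    )
    return 1 if meets_all else 0
```

Under these assumptions, a record with one chronic disease and all three functional criteria satisfied would be coded 1, while a record with three chronic diseases would be coded 0.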

Data collection and pre-processing

This study uses a dataset of 2000 older individuals drawn from the electronic medical records of the Afe Babalola University Multi-System Hospital in Nigeria. The hospital's research committee approved the study (Approval Number: AMSH/REC/AVA/133) on the condition that all international guidelines and regulations were followed. The data were collected retrospectively and did not involve interviews with participants. All data used in this research were anonymized and did not reveal any patient's identity. The dataset was collected from the electronic medical records of the Afe Babalola University Teaching Hospital between January 2019 and April 2023 and was pre-processed using various techniques to create optimal models. The dataset had some missing values; records containing them were not excluded, because they could carry information that influences predictions. Instead, each missing value was filled in with the mean of the corresponding feature. Another issue was the imbalanced class distribution, which was corrected to make the dataset more suitable for machine learning algorithms.³⁰ After pre-processing and balancing, the dataset included 2000 records of both successfully aged and unsuccessfully aged persons. Each record was then categorized as SA or non-SA based on having two or fewer diseases, quality of life, nutrition, and the capacity for daily activities. After categorization, the dataset contained 1100 SA and 900 non-SA instances.
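The mean-imputation step described above can be illustrated with a short NumPy sketch. The function name and the toy array are hypothetical and are not taken from the study's code. The class-balancing step is not shown, because the article does not state which balancing method was used.

```python
import numpy as np


def impute_with_column_mean(X: np.ndarray) -> np.ndarray:
    """Replace each NaN with the mean of the observed values in its column."""
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)       # per-feature mean, ignoring NaN
    rows, cols = np.where(np.isnan(X))      # positions of the missing entries
    X[rows, cols] = col_means[cols]
    return X


# Toy data (hypothetical): two features, one missing value in each column.
toy = np.array([[1.0, 10.0],
                [np.nan, 20.0],
                [3.0, np.nan]])
filled = impute_with_column_mean(toy)
```

In this toy example the missing entry in the first column becomes 2.0 and the missing entry in the second column becomes 15.0, so no record has to be discarded.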

Feature selection

Feature selection is a crucial technique in data mining that filters out redundant and irrelevant features, thereby reducing the dimensionality of the dataset and enhancing machine learning (ML) performance. By using statistical techniques to reduce dimensionality, feature selection can improve comprehension, help avoid overfitting, improve processing efficiency, and enhance mining performance.³¹,³² Our study involved working with gerontologists to identify the features that contribute to successful aging in the elderly population. Using Rowe and Kahn's model of successful aging, we sorted features according to its criteria: absence of disability (met when adults have no disability and the number of chronic diseases is less than or equal to two), quality of life (based on continued engagement with life through employment, social, religious, and volunteering activities), and ability to carry out daily activities (physical and mental functions). After careful consideration, we determined that the most relevant features for defining successful aging in elderly people from the dataset are age, hypertension/CVD, renal illness, liver disease, neuromuscular disease, depression, eye disease, diabetes, cancer, ADLs, nutrition status, and QOL. To improve the precision, sensitivity, and accuracy of all models, we removed marital status, occupation, educational level, gender, and hospital department from the dataset. Figure 2 shows the correlation between the selected features in the extracted dataset; the highest correlation is observed between renal disease and the ability to carry out daily activities.

Figure 2. Correlation between features of the dataset.

Figure 3 lists the variables in the dataset, including the feature name, feature type, and concept codes. In the dataset, NMD represents all categories of neuromuscular disease, QOL refers to each individual's quality of life, and ADLs represents the ability of older people to carry out activities of daily living.

Figure 3. Cross-section of the input dataset.

Development of classification models

For SA prediction, three supervised learning approaches were used. The elderly individuals were classified as either SAs or non-SAs using the ANN, Naïve Bayes, and Support Vector Machine algorithms.

ANN: An artificial neural network (ANN) is a machine learning technique that emulates the information processing mechanisms of the human brain. The network is made up of numerous processing units (neurons) that communicate with one another through weighted connections. A nonlinear mechanism in the network enables parallel processing, learning, and decision-making. An ANN adjusts its weighted connections by learning from a variety of training cases. Because the weights are adjustable, the network can also be retrained for a different environment.³³

Naïve Bayes: This algorithm assumes that the features are independent of one another given the class, and it is a direct application of Bayes' theorem. Because NB is a probabilistic model, it calculates the likelihood that a data sample belongs to a specific class.³⁴,³⁵ NB is also known as independent Bayes or simple Bayes. The technique is straightforward to develop and does not require difficult initial parameter estimates. As a result, it offers high accuracy and speed and can be applied to very large datasets. However, the technique is limited by its assumption of class-conditional independence and by its need for access to the data probabilities.³⁶,³⁷
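For reference, the independence assumption can be written out explicitly. The notation is introduced here for illustration and does not appear in the original article: $x = (x_1, \dots, x_n)$ is the feature vector of one individual and $y \in \{0, 1\}$ is the class (non-SA or SA).

$$
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} = \arg\max_{y \in \{0,1\}} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
$$

In the Gaussian variant listed in Table 1, each $P(x_i \mid y)$ is modelled as a normal density whose mean and variance are estimated separately for each class.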

Support Vector Machine: According to Zach,³⁸ the Support Vector Machine (SVM) is a supervised learning technique used for classification and regression tasks. Its primary purpose is to identify the most effective classifier for dividing a given dataset into two distinct classes. When the dataset is linearly separable, the SVM uses a linear function to determine a hyperplane that passes midway between the two classes. Multiple separating hyperplanes may exist in such cases, but the SVM identifies the optimal one by maximizing the margin between the two classes. The margin, as described in Ref. 39, refers to the degree of separation between the classes, as represented by the hyperplanes.

In machine learning, hyperparameters are the parameters used to regulate the learning process. Their values are set before the model begins learning, with the aim of improving the learning outcomes. Not all hyperparameters carry equal significance: some affect the effectiveness of an ML algorithm more strongly than others. Table 1 lists the hyperparameters that we employed in our models.

Table 1. Model hyperparameters.

ML Algorithm	Hyperparameters
ANN	number of hidden layers = 2, optimizer = adam, dropout = 0.2
Naïve Bayes	classifier = GaussianNB
Support Vector Machine	kernel = rbf, random_state = 0
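The NB and SVM rows of Table 1 map directly onto Scikit-Learn estimators. The sketch below is illustrative only: the data are synthetic stand-ins generated with `make_classification` (the hospital dataset is not reproduced here), and the split seed and stratification are assumptions. The ANN row is omitted because it depends on a Keras model definition.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in for the study data (assumption): 2000 rows, 12 features.
X, y = make_classification(n_samples=2000, n_features=12, random_state=0)

# 70/30 holdout split; the seed and stratification are assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Estimators configured with the hyperparameters listed in Table 1.
models = {
    "NB": GaussianNB(),
    "SVM": SVC(kernel="rbf", random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # holdout accuracy
```

Because the data are synthetic, the resulting scores say nothing about the accuracies reported in this study; the sketch only shows how the listed hyperparameters would be passed to the estimators.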

Design of classification models

The classifier design process comprises training and hyperparameter tuning. The classification models investigated are the Support Vector Machine, Naive Bayes, and Artificial Neural Network. For all models, training was done on between 50% and 70% of the dataset. Table 1 lists the optimized hyperparameters used for the Naive Bayes and Support Vector Machine models.

Initially, the ANN model was trained for 100 and 250 epochs; later in the experiment, a batch size of 32 and 500 epochs were used for adequate training. To find the best model configuration, we trained our models on various train/test ratios (40/60, 50/50, 60/40, and 70/30). The 70/30 and 50/50 configurations produced the best accuracy with the least over-fitting and are therefore presented as the ideal models in this study. The Adam optimizer was used to train the model, and network hyperparameters such as the number of hidden layers and the activation functions were tuned. ReLU activation was used in all hidden layers, while the output layer used a sigmoid activation function for classification. The dropout technique was applied to reduce over-fitting and improve the model's accuracy.⁴⁰ The proposed framework was developed in Python using the Anaconda distribution and libraries including Scikit-Learn, Keras, TensorFlow, NumPy, and Matplotlib. All experiments were conducted on a Dell Precision workstation equipped with 8 GB RAM, a 1 TB HDD, and a 3.3 GHz Core i5 processor. The test dataset was used to evaluate the prediction performance of our model, and 20% of the training dataset was used for validation.
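To make the network structure concrete, the forward pass of such a model can be sketched in plain NumPy. This is an illustration, not the Keras model used in the study: the hidden-layer widths (16 and 8), the random untrained weights, and the synthetic input batch are assumptions. The two hidden layers, ReLU and sigmoid activations, dropout rate of 0.2, and batch size of 32 follow the description above and Table 1.

```python
import numpy as np


def relu(z):
    return np.maximum(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(X, params, dropout_rate=0.2, training=False, rng=None):
    """Two ReLU hidden layers with inverted dropout, then one sigmoid output unit."""
    a = X
    for W, b in params[:-1]:
        a = relu(a @ W + b)
        if training and dropout_rate > 0.0:
            # Drop units at random and rescale the survivors so that the
            # expected activation is unchanged (inverted dropout).
            keep = rng.random(a.shape) >= dropout_rate
            a = a * keep / (1.0 - dropout_rate)
    W_out, b_out = params[-1]
    return sigmoid(a @ W_out + b_out)  # estimated probability of class 1 (SA)


rng = np.random.default_rng(0)
sizes = [12, 16, 8, 1]  # 12 input features; hidden widths are assumptions
params = [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

X_batch = rng.normal(size=(32, 12))  # one synthetic batch of 32 records
probs_train = forward(X_batch, params, training=True, rng=rng)
probs_eval = forward(X_batch, params)  # dropout is disabled at evaluation time
```

Dropout is applied only while training; at evaluation time every unit is kept, which is why the evaluation pass is deterministic for fixed weights.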

Evaluation of the machine learning models

Cross-validation was not used in this investigation because the dataset had sufficient samples for training and testing. Instead, the holdout validation technique was employed. After training the SVM, NB, and ANN models, their performance was evaluated on the testing dataset using accuracy, precision, sensitivity, specificity, and F1 score. In each experiment, the predictions on the test set fall into four groups: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). The confusion matrix was used to count these four groups, which in turn helped evaluate each model's effectiveness. Table 2 provides the formulas required to calculate the measures mentioned above. Accuracy measures the proportion of correct predictions among all predictions. Precision assesses how many of the model's positive predictions are correct. Sensitivity measures how many of the actual positive instances the model identified, while specificity measures how many of the actual negative instances the model identified. The F1 score is the harmonic mean of precision and sensitivity.
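The holdout procedure can be sketched as an index split. This is a minimal illustration under stated assumptions: the function name, the random shuffling, and the seed are hypothetical, while the 70% training share and the 20% validation share carved out of the training portion follow the description in the article.

```python
import numpy as np


def holdout_split(n_samples, train_fraction=0.7, val_fraction=0.2, seed=0):
    """Shuffle indices, then split them into train, validation, and test sets.

    The validation set is taken out of the training portion.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train_total = int(round(train_fraction * n_samples))
    n_val = int(round(val_fraction * n_train_total))
    val_idx = idx[:n_val]
    train_idx = idx[n_val:n_train_total]
    test_idx = idx[n_train_total:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = holdout_split(2000, train_fraction=0.7)
```

For 2000 records and a 70/30 ratio, this yields 600 test indices and 1400 training indices, of which 280 are set aside for validation.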

Table 2. Performance metrics formula.

Performance metrics	Formulas
Accuracy	$\frac{TP + TN}{TP + TN + FP + FN}$
Precision	$\frac{TP}{TP + FP}$
Specificity	$\frac{TN}{TN + FP}$
Sensitivity	$\frac{TP}{TP + FN}$
F1 Score	$2 \times \frac{Precision \times Sensitivity}{Precision + Sensitivity}$
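The formulas in Table 2 translate directly into code. The sketch below is illustrative: the function name and the example counts are hypothetical and are not taken from the study's confusion matrices.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the Table 2 metrics from confusion-matrix counts.

    No guard against empty denominators is included, so every class must
    be represented in the counts.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Hypothetical counts chosen only for illustration.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

With these example counts, the accuracy is 0.85, the specificity is 0.90, and the sensitivity is 0.80.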

Results and discussion

The following section outlines the results obtained from the study. First, the results of experiments in which 50% of the data were used for training are presented, followed by the results of experiments in which 70% were used for training. The confusion matrices display the classification outcomes. Each sample belongs to one of two classes, 0 for unsuccessful aging or 1 for successful aging, so the matrices have dimensions 2 × 2. The performance measures are derived from the confusion matrices. To better understand how the models performed during the experiments, graphs and tables are provided and explained. Furthermore, the performances of all three models are compared and evaluated.

Experimental results for SVM, NB, and ANN models

Experimental results on 50% of training

In this experiment, the samples were divided into two halves: 1000 samples for training and 1000 samples for testing. Figures 4, 5, and 6 show the confusion matrices used to evaluate the performance of the three ML algorithms (SVM, NB, and ANN), from which the performance measures were calculated. The SVM model correctly predicted 550 SA cases, with no SA case misdiagnosed, and correctly predicted 183 non-SA cases. However, it misclassified 267 non-SA cases. The model achieved 73.3% accuracy and 67.3% specificity. Although its sensitivity was 100%, its F1 score was 57.8%, indicating that the model's efficiency may decrease when faced with unseen data on a large scale.

Figure 4. Confusion matrix for SVM50 classifier.

Figure 5. Confusion matrix for NB50 classifier.

Figure 6. Confusion matrix for ANN50 classifier.

The NB model, on the other hand, correctly predicted 561 SA cases and 371 non-SA cases, with only 68 non-SA cases mispredicted, achieving a precision of 84.5% and an accuracy of 93.2%. It outperformed the SVM model by a significant margin and is expected to be useful in predicting successful aging in a new dataset. The confusion matrix for the ANN model showed that it correctly predicted 550 SA cases and 450 non-SA cases. Its accuracy was 6.8 percentage points higher than that of the NB model, an improvement attributed to the dropout technique.

Experimental results on 70% of training

In this experiment, 70% of the samples (1400 samples) were assigned for training and the remaining 30% (600 samples) for testing. The three machine learning methods (SVM, NB, and ANN) were evaluated using the confusion matrices shown in Figures 7, 8, and 9, from which the performance measures were calculated. Figure 7 shows that the SVM model predicted 330 SA instances correctly, with no cases of SA misdiagnosis. The model also predicted 178 non-SA cases correctly but mispredicted 92 non-SA cases. The SVM model achieved an accuracy of 84.7% and a precision of 65.9%. The NB model, as shown in Figure 8, predicted 332 SA instances correctly, with no cases of SA misdiagnosis. The model also predicted 224 non-SA cases correctly but mispredicted 44 non-SA cases. The NB model achieved an accuracy of 92.7% and a precision of 83.6%. The NB model outperformed the SVM model by a significant margin and is expected to be effective in predicting successful aging in a new dataset. The confusion matrix of the ANN model trained on 70% of the data is shown in Figure 9. The model correctly predicted 330 instances of SA and 270 instances of non-SA. The ANN model with dropout surpassed the NB model in precision, with a gain of 16.4 percentage points.
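As a check on the arithmetic, the accuracy formula from Table 2 can be applied to the counts reported above for the 70/30 split, treating SA as the positive class and the mispredicted non-SA cases as false positives (an interpretation assumed here). For the SVM and NB models, respectively:

$$
\text{Accuracy}_{\text{SVM70}} = \frac{TP + TN}{TP + TN + FP + FN} = \frac{330 + 178}{330 + 178 + 92 + 0} = \frac{508}{600} \approx 84.7\%
$$

$$
\text{Accuracy}_{\text{NB70}} = \frac{332 + 224}{332 + 224 + 44 + 0} = \frac{556}{600} \approx 92.7\%
$$

Both values agree with the accuracies stated in the text.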

Figure 7. Confusion matrix for SVM70 classifier.

Figure 8. Confusion matrix for NB70 classifier.

Figure 9. Confusion matrix for ANN70 classifier.

ANN model accuracy and loss plots

Figure 10 displays the accuracy plot of the ANN model. The model's training accuracy increased from 0.8 to 0.9 between 50 and 100 epochs. At 300 epochs, the validation accuracy was 0.92, and it continued to increase from there. The reported average training accuracy was 100%. The accuracy and loss plots of the ANN model for the other train/test ratios were consistent with these. The plot suggests that the ANN model is not overfitting, which indicates that the model can be used to predict successful aging on new data. Figure 11 shows the training and validation losses of the model. As seen in the graph, the loss reached its minimum after 500 epochs. The low loss of the ANN model is significant for providing accurate predictions of successful aging, as supported by research.

Figure 10. Training and validation accuracy of ANN50 model.

Figure 11. Training and validation loss of ANN50 model.

Performance of all the ML models

The results of all the machine learning (ML) models are listed in Table 3. Precision measures the proportion of individuals assigned by the classifier to the positive class (SA) who are actually positive. The proposed ANN model performs best on this criterion, while the SVM model is the least efficient. The NB50 model has an accuracy of 93.2%, which is 0.5 percentage points higher than that of the NB70 model. All three algorithms (ANN, SVM, and NB) achieve the best possible sensitivity, 100%. Sensitivity is crucial for detecting every SA-positive person in the dataset. When it comes to identifying everyone who does not have SA, the ANN50 and ANN70 models outperform the other ML models, achieving the highest possible specificity of 100%. An algorithm is considered highly efficient when it strikes a favourable balance between sensitivity and specificity, and the design proposed in this research achieves the best balance between these two requirements. The F1-score, which combines sensitivity and precision, is 100% for the ANN-based models, higher than that of the NB50 and NB70 models (91.6% and 91.1%).

Table 3. Evaluation of the efficiency of ML models.

Model	Accuracy (%)	Precision (%)	Specificity (%)	Sensitivity (%)	F1-Score (%)
NB50	93.2	84.5	89.2	100	91.6
SVM50	73.3	40.7	67.3	100	57.8
ANN50	100	100	100	100	100
NB70	92.7	83.6	88.3	100	91.1
SVM70	84.7	65.9	78.2	100	79.5
ANN70	100	100	100	100	100

Accuracy is the most fundamental and direct indicator of a classifier's performance: it reflects the proportion of samples that are correctly classified. The SVM50 model has the lowest classification accuracy (73.3%), and the best classifier is the ANN-based approach, with an accuracy of 100%. Figure 12 is a bar chart that compares the machine learning algorithms by F1 score, specificity, sensitivity, accuracy, and precision.

Figure 12. Bar graph showing performance of machine learning models.

Discussion

This study aimed to assess successful aging (SA) by examining three key dimensions: physiological, cognitive psychological, and social functioning. This approach aligns with Rowe and Kahn’s theory.¹⁸ To determine whether a person is aging successfully, the study set out to create prediction models that use clinical and lifestyle factors as inputs. The findings offer important new perspectives for determining SA likelihood. Consistent with the main hypothesis, the proposed machine learning (ML) technique produced a potent SA status classifier. The study introduced a novel approach that uses three key ML techniques to forecast SA.

A prediction model can be created using a variety of ML methods, but the numerous fundamental model assumptions behind today’s ML approaches can preclude successful application, because it is frequently challenging to verify those assumptions. The optimal method for dealing with a dataset that is both highly variable and noisy, as successful aging data are, therefore remains uncertain. Furthermore, no single ML technique yields reliable prediction outcomes in every setting. Researchers continually seek well-trained ML models that perform accurately and reliably. In reality, however, trained models are sometimes biased, so their outputs are not always flawless.

Table 4 lists studies that have used ML techniques to forecast and recognize successful aging through diverse approaches and compares them with the proposed approach. In this study, an Artificial Neural Network (ANN)-based technique was introduced to determine whether the tested individuals fall into the SA category. The data were first preprocessed to make them appropriate for data mining analysis. The ANN was then introduced, and it predicted SA better than the SVM and NB models. Using a holdout validation approach, the predictive system achieved 100% precision, specificity, accuracy, and F1-score.
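Holdout validation, as mentioned above, sets aside a fixed share of the records for testing. The following sketch shows 50/50 and 70/30 partitions of 2000 record indices. It assumes that the suffixes 50 and 70 in the model names denote the training share; the function name, the seed, and the use of shuffling are illustrative assumptions rather than the authors' implementation.

```python
import random


def holdout_split(n_records, train_fraction, seed=0):
    """Shuffle record indices and split them into train and test lists."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)   # seeded for reproducibility
    cut = int(round(n_records * train_fraction))
    return indices[:cut], indices[cut:]


# Two holdout configurations over a 2000-record dataset.
train50, test50 = holdout_split(2000, 0.5)
train70, test70 = holdout_split(2000, 0.7)

print(len(train50), len(test50))
print(len(train70), len(test70))
```

Because a single holdout split yields one estimate of performance, results can depend on which records happen to fall in the test set.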

Table 4. Comparison of the proposed optimal model with existing methods for the prediction of Successful Aging.

Authors	Purpose	Classifier(s)	Accuracy (%)
⁴¹	Prediction of mental impairment in aged people	Ensemble classifier	87.4
²⁷	Prediction of SA maintenance	Deep learning	89.3
²⁶	Prediction of SA	Ensemble KNN	89.6
²⁸	Prediction of SA	Random Forest	95
Implemented work	Prediction of SA	Artificial Neural Network	100

According to current research, machine learning (ML) has shown consistent accuracy in predicting successful aging (SA) in older individuals. The computed metrics show that ML models trained on specific attributes accurately forecast SA. Drawing on past theoretical and empirical studies for feature selection may improve prediction accuracy by reducing the number of irrelevant or superfluous variables in the model. Dropout regularization was also used to prevent overfitting, which improved the accuracy of the ANN model and extended the scope of its application.

Although our study performed well in estimating SA in older persons, it has several potential weaknesses that need to be addressed. Firstly, a dataset from a single database was examined retrospectively, which could affect data accuracy, completeness, and generalizability. To increase data homogeneity in the pre-processing stage, we established the standard selection of each variable through conversations with gerontology experts. This method made it easier for us to extract the most useful raw data from the hospital’s electronic medical record. Secondly, this study applied only three ML algorithms to a dataset of 2000 records. The accuracy and generalizability of our models could be enhanced by assessing a wider range of ML methods on larger, multicenter, and prospective datasets. Thirdly, the outcomes of the current investigation should be confirmed using an external validation approach. Lastly, the link between the predictor and outcome factors was not examined in this study. Future research should explore long-term factors associated with successful aging. The suggested model can also be improved to provide scalable prediction models for successful aging.
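Dropout, cited above as the regularizer for the ANN, randomly zeroes hidden activations during training. Below is a minimal sketch of the commonly used "inverted dropout" formulation. The drop rate, function name, and sample activations are assumptions for illustration, since this section does not give the network's configuration.

```python
import random


def dropout(activations, rate=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability `rate` during
    training and rescale the survivors so the expected value is unchanged."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)          # inference: pass through unchanged
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


hidden = [0.2, 1.5, 0.7, 0.9]             # hypothetical hidden-layer outputs
train_out = dropout(hidden, rate=0.5, rng=random.Random(0))
eval_out = dropout(hidden, training=False)
```

At inference time the layer is an identity, so no rescaling of weights is needed after training.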

Conclusion

The study aimed to assess successful aging (SA) by considering three aspects of this health state: physiological, cognitive psychological, and social function. The SA model proposed by Rowe and Kahn was used as the basis for the study, which employed machine learning techniques to predict SA accurately. The study found that machine learning improved the prediction of SA, indicating that it is a promising approach. On average, the ML models used in the study achieved an accuracy, specificity, and F1-score of >90%, >87%, and >86%, respectively. Additionally, the ANN models achieved 100% accuracy and sensitivity. These findings demonstrate that machine learning can significantly enhance the prediction of successful aging. The results of this study could be of great benefit to geriatricians and senior nurses, as they could improve their ability to provide better assistance and care to older adults. Moreover, the prediction models we have created could be used by healthcare executives and policymakers as a reliable and adaptable tool to improve geriatric outcomes.
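The averages quoted above can be reproduced from Table 3. The short sketch below takes the unweighted mean over the six models; treating the average as unweighted is an assumption, as the text does not state how it was computed.

```python
from statistics import mean

# (accuracy, specificity, F1) in %, copied from Table 3.
table3 = {
    "NB50": (93.2, 89.2, 91.6),
    "SVM50": (73.3, 67.3, 57.8),
    "ANN50": (100.0, 100.0, 100.0),
    "NB70": (92.7, 88.3, 91.1),
    "SVM70": (84.7, 78.2, 79.5),
    "ANN70": (100.0, 100.0, 100.0),
}

# Transpose the rows into columns and average each metric.
mean_accuracy, mean_specificity, mean_f1 = (
    mean(column) for column in zip(*table3.values())
)

print(round(mean_accuracy, 2), round(mean_specificity, 2), round(mean_f1, 2))
```

Under this assumption the three means fall just above the 90%, 87%, and 86% thresholds stated in the text.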

Declarations

Authors’ contributions

JEZ – Conceptualization, Methodology, Software, Result Analysis, and Writing (original draft preparation, final review, and editing); VAO – Software, Data Curation, and Original Draft; AJO – Investigation, Visualization, and Writing (final review and editing); ASO – Software, Data Curation, Methodology, and Writing (final review and editing). All authors read and approved the manuscript.

Data availability

Zenodo: Successful Aging Dataset for Elderly Patients 10.5281/zenodo.8132494.⁴²

This project contains the following underlying data:

• SAData.xlsx (anonymized records of elderly individuals for the prediction of successful aging; 2000 records in total).

Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

Software availability

• Software code available from: https://zenodo.org/record/8184228

License: The software is available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

References

1. Lin Y-H, Chen Y-C, Tseng Y-C, et al.: Physical activity and successful aging among middle-aged and older adults: a systematic review and meta-analysis of cohort studies. Aging (Albany NY). 2020; 12(9): 7704–7716.
2. Chandra CE, Abdullah S: Forecasting mortality trend of Indonesian old aged population with bayesian method. Int. J. Adv. Sci. Eng. Inf. Technol. 2022; 12(2): 580–588.
3. Seyda Seydel G, Kucukoglu O, Altinbasv A, et al.: Economic growth leads to increase of obesity and associated hepatocellular carcinoma in developing countries. Ann. Hepatol. 2016; 15(5): 662–672.
4. Mbam KC, Halvorsen CJ, Okoye UO: Aging in Nigeria: A Growing Population of Older Adults Requires the Implementation of National Aging Policies. Gerontologist. 2022 Oct 19; 62(9): 1243–1250.
5. Wang Q, Li L: The effects of population aging, life expectancy, unemployment rate, population density, per capita GDP, urbanization on per capita carbon emissions. Sustain Product Consum. 2021; 28: 760–774.
6. Kiziltan M: The Effects of Population Aging and Life Expectancy on Economic Growth: The Case of Emerging Market Economies. In: Bayar Y, editor. Handbook of Research on Economic and Social Impacts of Population Aging. IGI Global; 2021; pp. 97–118.
7. Seong MH, Shin E, Sok S: Successful aging perception in middle-aged Korean men: a Q methodology approach. Int. J. Environ. Res. Public Health. 2021; 18(6): 3095.
8. Lin L, Wang HH, Lu C, et al.: Adverse childhood experiences and subsequent chronic diseases among middle-aged or older adults in China and associations with demographic and socioeconomic characteristics. JAMA Netw. Open. 2021; 4(10): e2130143.
9. Ferrucci L, Gonzalez-Freire M, Fabbri E, et al.: Measuring biological aging in humans: a quest. Aging Cell. 2020; 19(2): e13080.
10. Lin E, Lin C-H, Lane H-Y: Prediction of functional outcomes of schizophrenia with genetic biomarkers using a bagging ensemble machine learning method with feature selection. Sci. Rep. 2021; 11(1): 1–8.
11. Nosraty L, Pulkki J, Raitanen J, et al.: Successful aging as a predictor of long-term care among oldest old: the vitality 90+ study. J. Appl. Gerontol. 2019; 38(4): 553–571.
12. Mendoza-Nunez VM, Pulido-Castillo G, Correa-Munoz E, et al.: Effect of a community gerontology program on the control of metabolic syndrome in Mexican older adults. Healthcare. 2022; 10(3): 466.
13. Teater B, Chonody JM: What attributes of successful aging are important to older adults? The development of a multidimensional definition of successful aging. Soc. Work Health Care. 2020; 59(3): 161–179.
14. Bowling A: Aspirations for older age in the 21st century: What is successful aging? Int. J. Aging Hum. Dev. 2007; 64(3): 263–297.
15. Bosnes I, Nordahl HM, Stordal E, et al.: Lifestyle predictors of successful aging: a 20-year prospective HUNT study. PLoS One. 2019; 14(7): e0219200.
16. Rowe JW, Kahn RL: Successful aging. Gerontologist. 1997; 37(4): 433–440.
17. Shafiee M, Hazrati M, Motalebi SA, et al.: Can healthy life style predict successful aging among Iranian older adults? Med. J. Islam Repub. Iran. 2020; 34: 139.
18. Chiao CY, Hsiao CY: Comparison of personality traits and successful aging in older Taiwanese. Geriatr. Gerontol. Int. 2017; 17(11): 2239–2246.
19. Dorji L, Jullamate P, Subgranon R, et al.: Predicting factors of successful aging among community dwelling older adults in Thimphu, Bhutan. Bangkok Med. J. 2019; 15(1): 38–43.
20. Ng TP, Broekman BF, Niti M, et al.: Determinants of successful aging using a multidimensional definition among Chinese elderly in Singapore. Am. J. Geriatr. Psychiatry. 2009; 17(5): 407–416.
21. Anton SD, Woods AJ, Ashizawa T, et al.: Successful aging: advancing the science of physical independence in older adults. Ageing Res. Rev. 2015; 24: 304–327.
22. Liu H, Byles JE, Xu X, et al.: Evaluation of successful aging among older people in China: results from China health and retirement longitudinal study. Geriatr. Gerontol. Int. 2017; 17(8): 1183–1190.
23. Canêdo AC, Lopes CS, Lourenço RA: Prevalence of and factors associated with successful aging in Brazilian older adults: frailty in Brazilian older people study (FIBRA RJ). Geriatr. Gerontol. Int. 2018; 18(8): 1280–1285.
24. Cai T, Long J, Kuang J, et al.: Applying machine learning methods to develop a successful aging maintenance prediction model based on physical fitness tests. Geriatr. Gerontol. Int. 2020; 20(6): 637–642.
25. Raza K: Improving the prediction accuracy of heart disease with ensemble learning and majority voting rule. U-Healthcare Monitoring Systems. Elsevier; 2019; pp. 179–196.
26. Asghari Varzaneh Z, Shanbehzadeh M, Kazemi-Arpanahi H: Prediction of successful aging using ensemble machine learning algorithms. BMC Med. Inform. Decis. Mak. 2022; 22: 258.
27. Cai T, Long J, Kuang J, et al.: Applying machine learning methods to develop a successful aging maintenance prediction model based on physical fitness tests. Geriatr. Gerontol. Int. 2020; 20(6): 637–642.
28. Maryam A, Raoof N, Somayeh N: Designing a Predictive Model for Successful Aging among the Elderly Using Machine Learning Techniques. 2022.
29. Nagarajan NR, Teixeira AA, Silva ST: Ageing population: identifying the determinants of ageing in the least developed countries. Popul. Res. Policy Rev. 2021; 40(2): 187–210.
30. Olson DL: Data set balancing. In: Chinese Academy of Sciences Symposium on Data Mining and Knowledge Management. Berlin, Heidelberg: Springer; 2004 Jul 12; 71–80.
31. Chandrashekar G, Sahin F: A survey on feature selection methods. Comput. Electr. Eng. 2014; 40(1): 16–28.
32. Li J, Cheng K, Wang S, et al.: Feature selection: a data perspective. ACM Comput. Surv. 2017; 50(6): 1–45.
33. Guan Z-J, Li R, Jiang J-T, et al.: Data mining and design of electromagnetic properties of Co/FeSi filled coatings based on genetic algorithms optimized artificial neural networks (GA-ANN). Compos. B Eng. 2021; 226: 109383.
34. Sinaga LM, Suwilo S: Analysis of classification and Naïve Bayes algorithm k-nearest neighbor in data mining. IOP Conference Series: Materials Science and Engineering. IOP Publishing; 2020; Vol. 725(1): p. 012106.
35. Sembiring M, Tambunan R: Analysis of graduation prediction on time based on student academic performance using the Naïve Bayes Algorithm with data mining implementation (Case study: Department of Industrial Engineering USU). IOP Conference Series: Materials Science and Engineering: 2021. IOP Publishing; 2021; p. 012069.
36. Gopinath C, Manikanta J: Performance Analysis Based on Data Mining Technique in Predicting the Diabetic Disease-Decision tree and Naïve Bayes. 2019 1st International Conference on Advances in Information Technology (ICAIT): 2019. IEEE; 2019; pp. 525–528.
37. Prasetya R, Ridwan A: Data mining application on weather prediction using classification tree, naïve bayes and K-nearest neighbor algorithm with model testing of supervised learning probabilistic brier score, confusion matrix and ROC. J. Appl. Commun. Inf. Technol. 2020; 4(2): 25–33.
38. Khazaei S, Najafi-Ghobadi S, Ramezani-Doroh V: Construction data mining methods in the prediction of death in hemodialysis patients using support vector machine, neural network, logistic regression and decision tree. J. Prev. Med. Hyg. 2021; 62(1): E222–E230.
39. Chidambaram S, Srinivasagan K: Performance evaluation of support vector machine classification approaches in data mining. Clust. Comput. 2019; 22(1): 189–196.
40. Srivastava N, Hinton G, Krizhevsky A, et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014; 15(1): 1929–1958.
41. Byeon H: Exploring factors for predicting anxiety disorders of the elderly living alone in South Korea using interpretable machine learning: a population-based study. Int. J. Environ. Res. Public Health. 2021; 18(14): 7625.
42. Zaccheus J: Successful Aging Dataset for Elderly Patients. 2023.

Competing interests

No competing interests were disclosed.

Grant information

The author(s) declared that no grants were involved in supporting this work.

Article Versions (2)

version 2

Revised

Published: 03 Apr 2024, 12:1201

https://doi.org/10.12688/f1000research.138608.2

version 1

Published: 25 Sep 2023, 12:1201

https://doi.org/10.12688/f1000research.138608.1

Copyright

© 2024 Zaccheus J et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite this article

Zaccheus J, Atogwe V, Oyejide A and Salau AO. Towards successful aging classification using machine learning algorithms [version 2; peer review: 4 approved with reservations, 1 not approved]. F1000Research 2024, 12:1201 (https://doi.org/10.12688/f1000research.138608.2)

NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to Reviewer Statuses

Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.

Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.

Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.

Version 2

Reviewer Report 06 Nov 2024

Larissa Pruner Marques, Oswaldo Cruz Foundation, Rio de Janeiro, Rio de Janeiro, Brazil

Approved with Reservations

https://doi.org/10.5256/f1000research.163894.r326282

This study investigates the application of machine learning (ML) techniques to classify individuals into successful aging (SA) and non-successful aging categories, emphasizing the importance of a positive perspective on aging. By utilizing geriatric data from a hospital in Nigeria, the researchers employed three ML methods, including artificial neural networks (ANN), support vector machines, and Naive Bayes, to analyze a sample of 2,000 individuals.
Here are a few topics for improvement:

Theoretical Definition of Successful Aging: The authors define successful aging (SA) in the abstract as being free from diseases, relying on a theoretical framework that may overly constrain the concept. This perspective does not adequately reflect the adaptability and multifactorial nature of aging, which is more comprehensively discussed in the introduction. By presenting such a limiting definition upfront, the article may mislead readers regarding the broader understanding of successful aging.
Potential Ageist Terminology: The use of terms like "geriatric data" and "elderly" throughout the article could inadvertently promote ageism. It is important for the authors to revisit and reconsider these terms, opting for language that is more respectful and inclusive of older adults, thereby fostering a more positive discourse around aging and medical terms.
Citing References: In the sixth paragraph of the introduction, the authors reference previous works using the term "Ref. 26" instead of the authors' surnames. This could disrupt the flow of reading and diminish the credibility of the sources. The authors should ensure proper citation by using the authors' names for clarity and engagement.
Introduction Structure: The final paragraph of the introduction begins with the study's objective but then shifts to discussing methodology. While providing justification for the research is important, this paragraph would benefit from a more cohesive structure that concludes with the research objective rather than detailing the methodology. This adjustment would enhance clarity and maintain focus.
Clarity on Variables for Successful Aging: A significant limitation of the article is the lack of a clear framework for defining and measuring the variables associated with successful aging. The authors should specify how they assess nutritional status, daily living activities, and quality of life, including the cutoff points for classification. This clarity is essential for understanding the model's reproducibility, as emphasized by the authors in their conclusion.
Table Placement: The inclusion of a table in the discussion section is unconventional and may confuse readers. The authors might consider repositioning the table as supplementary material to maintain the integrity of the discussion.
Data Availability: I commend the authors for their commitment to transparency and openness by making the data available on Zenodo. This is a commendable practice that supports the reproducibility and integrity of research findings.

By addressing these critiques, the authors can enhance the clarity, inclusivity, and overall quality of their article.

Is the work clearly and accurately presented and does it cite the current literature? Partly
Is the study design appropriate and is the work technically sound? Yes
Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.
Are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions drawn adequately supported by the results? Yes

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: public health, multimorbidity, quality of life, aging

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report 18 Oct 2024

Tagne Poupi Theodore Armand, Inje University, Gimhae, South Korea

Approved with Reservations

https://doi.org/10.5256/f1000research.163894.r326281

The research paper presented by Zaccheus et al. uses three state-of-the-art machine learning models to classify successful aging (SA). The authors used 2000 sample records from individuals at Afe Babalola University Multi-System Hospital in this research. The results are outstanding, with a report of 100% accuracy, 100% sensitivity, and 100% precision, but they seem unrealistic. Though the method employed is not new, the case study in SA can be significant for science, with positive outcomes for patients considering the size of the aging population worldwide.

Here are some points that can improve the manuscript.

1. In the data collection process, the authors mentioned the non-exclusion of missing values and said they used the mean imputation method. Dealing with medical data requires more explanation and sometimes expert intervention for missing data handling processes, which must be revised. It remains unclear whether the imputed mean values biased the dataset.

2. It is unclear why the authors used two data split configurations of 70/30 and 50/50 for training. They further indicated that 20% was used for validation; in the experimental detail, they mention 30% for testing. The data split into training, testing, and validation remains confusing.

3. The feature selection process is not sufficiently described. The authors mentioned consulting gerontologists but said they selected some of the most relevant features. Did you apply any additional techniques to determine feature importance? If yes, describe them.

4. Though an external validation set may be difficult to obtain, it can further confirm the results.

5. There is no clear justification for the choice of machine learning algorithms.

6. Is there any clear conclusion about using two proportions of datasets?

Is the work clearly and accurately presented and does it cite the current literature? Partly
Is the study design appropriate and is the work technically sound? Partly
Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? No
Are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions drawn adequately supported by the results? Yes

Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Version 1

Reviewer Report 06 Mar 2024

Peter Fedichev, Gero PTE. LTD, Singapore, Singapore

Approved with Reservations

https://doi.org/10.5256/f1000research.151819.r246031

The manuscript outlines a study conducted on a small cohort from Afe Babalola University Multi-System Hospital’s electronic records in Nigeria (January 2019 - April 2023), detailing a Machine Learning (ML) pipeline to assess SA with novel insights on feature selection and accuracy metrics.

Although technically proficient, the small sample size poses limitations on model complexity and validation robustness. To enhance validation, I recommend cross-validation with external cohorts (e.g., NHANES, UK Biobank), ensuring the model's features are universally applicable.

Additionally, the manuscript employs a linear classifier for SA, based on continuous predictors (log odds ratio). It is crucial to evaluate how this correlates with established aging measures such as the frailty index (FI) or biological age (BA). A comparison with log-linear classifiers using FI or BA, if data permit, could enrich the manuscript.

I understand that integrating these comparisons might be demanding. I leave it to the authors to decide whether to add new results or to defer this to future exploration.

However, I believe that a fair discussion of the sample size, model limitations and the relationship between the SA and FI and BA, all supported by recent literature, is essential. Notable references include Kenneth Rockwood's work on FI (Ref [1]) and recent reviews on BA by the biomarkers of aging consortium (Ref [2]).

Is the work clearly and accurately presented and does it cite the current literature? Partly
Is the study design appropriate and is the work technically sound? Yes
Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Yes

References

1. Blodgett J, Theou O, Kirkland S, Andreou P, et al.: Frailty in NHANES: Comparing the frailty index and phenotype. Arch Gerontol Geriatr. 2015; 60 (3): 464-70
2. Moqri M, Herzog C, Poganik JR, Ying K, et al.: Validation of biomarkers of aging. Nat Med. 2024; 30 (2): 360-372

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: theories of aging, AI/ML in biology and drug discovery, systems biology, biomarkers of aging, drug discovery against aging and age-related diseases

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Reviewer Report 26 Feb 2024

Jiao Yu, Yale University, Yale, USA

Approved with Reservations

https://doi.org/10.5256/f1000research.151819.r246029

The research investigates successful aging classification using three machine learning algorithms, focusing on predicting successful aging in a sample of 2000 older individuals in Nigeria. A key focus of the study was the assessment of various machine learning models for successful aging prediction. Among these models, the artificial neural network (ANN) demonstrates the best performance. While the study's main findings highlight the potential of machine learning in predicting successful aging, certain limitations require consideration.

Abstract: Include a clear sentence in the background section stating the specific problem your research aims to address.

According to Rowe and Kahn’s (1997) framework, successful aging is frequently operationalized by the absence of major diseases, the absence of disabilities in activities of daily living (ADL), high levels of physical and cognitive functioning, and active social engagement. It is unclear how the Rowe and Kahn approach was applied to create the outcome variable for successful aging classification.

It is unclear what the limitations of previous works were. Did the previous studies fail to address the limitations mentioned in this analysis, for example by failing to select significant aging-related features?

Concerns also arise regarding the justification of the sample size. The authors need to provide a rationale for why the chosen sample size is deemed sufficient for the study's objectives.

It is unclear as to how the feature selection was conducted. It is unclear whether the machine learning models utilized in this study also served as feature selection algorithms or if an alternative approach was adopted.

I am also concerned that the high accuracy of ANN may result from overfitting, particularly in the absence of independent validation data. This may also be due to the fact that predictors are highly correlated with your outcome. You may consider a correlation analysis between predictor variables and the outcome variable. If high correlations exist, discuss their impact on model estimation.

While the ANN method shows the best performance in this dataset, its generalizability remains uncertain. How these methods can be applied in practice needs further development in the discussion.

Is the work clearly and accurately presented and does it cite the current literature?

No
Is the study design appropriate and is the work technically sound?

Yes
Are sufficient details of methods and analysis provided to allow replication by others?

No
If applicable, is the statistical analysis and its interpretation appropriate?

Yes
Are all the source data underlying the results available to ensure full reproducibility?

Partly
Are the conclusions drawn adequately supported by the results?

Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: aging, health disparities

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.


Reviewer Report 19 Feb 2024

Brian H Chen, UC San Diego, San Diego, California, USA

Not Approved

https://doi.org/10.5256/f1000research.151819.r235047

Zaccheus et al. present an analysis comparing 3 machine learning approaches for the classification of "successful aging" (SA) from electronic medical records from 2,000 patients from a single hospital.

The paper needs further work on its premise, since the authors seem to be using input variables that are part of the definition of the outcome. Furthermore, insufficient details are provided on the methodology.

Here are my suggestions:

1) The use of Rowe and Kahn definition of "Successful Aging" is not widely accepted. That said, it was not clear how the principles posed by Rowe and Kahn were applied to create the outcome variable.

2) The authors should make clearer why they selected the input variables for SA classification. I imagine that some of these input variables (e.g., diseases, ADLs, QOL) are part of the definition of "SA." Are the authors merely trying to recreate Rowe & Kahn's model with the addition of other variables?

3) The arbitrary selection of machine learning algorithms needs further justification. I would predict a priori that the SVM and Naive Bayes approaches perform worse than any neural network.

4) The authors were unclear as to how the feature selection was actually conducted. We will need more detail than those variables being "most pertinent to SA."

5) The authors did not validate their models using an independent sample, which explains their overly optimistic model performance.

6) The confusion matrix in Figure 4 is incorrect.
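The concern about inputs that form part of the outcome definition can be illustrated with a small sketch. It assumes a hypothetical rule-based label built from two made-up inputs (`diseases`, `adl_disability`); none of this is the authors' data or code. When the label is a deterministic function of the inputs, even a lookup-table "model" scores highly on held-out rows, so a high accuracy says little about predictive value.

```python
import random
from collections import Counter, defaultdict

random.seed(1)


def make_record():
    # Hypothetical inputs; the names are illustrative only.
    diseases = random.randint(0, 3)
    adl_disability = random.randint(0, 1)
    # Outcome defined directly from the inputs (a Rowe-and-Kahn-style rule).
    label = int(diseases == 0 and adl_disability == 0)
    return (diseases, adl_disability), label


data = [make_record() for _ in range(400)]
train, test = data[:200], data[200:]

# "Model": memorise the majority label for each input combination seen in training.
votes = defaultdict(Counter)
for features, label in train:
    votes[features][label] += 1


def predict(features):
    if features not in votes:
        return 0  # fall back to label 0, the more common class in this setup
    return votes[features].most_common(1)[0][0]


accuracy = sum(predict(f) == y for f, y in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this toy setup a random train/test split does not protect against the problem, which is why the construction of the outcome variable needs to be reported explicitly.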

Is the work clearly and accurately presented and does it cite the current literature?

No
Is the study design appropriate and is the work technically sound?

No
Are sufficient details of methods and analysis provided to allow replication by others?

No
If applicable, is the statistical analysis and its interpretation appropriate?

No
Are all the source data underlying the results available to ensure full reproducibility?

Yes
Are the conclusions drawn adequately supported by the results?

No

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Aging biomarkers and prediction modeling using machine learning.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.


Version 2

VERSION 2 PUBLISHED 25 Sep 2023

Open Peer Review

Reviewer Status

Reviewer Reports

Invited Reviewers

1. Brian H Chen, UC San Diego, San Diego, USA: reported on Version 1 (25 Sep 23)
2. Jiao Yu, Yale University, Yale, USA: reported on Version 1 (25 Sep 23)
3. Peter Fedichev, Gero PTE. LTD, Singapore, Singapore: reported on Version 1 (25 Sep 23)
4. Tagne Poupi Theodore Armand, Inje University, Gimhae, South Korea: reported on Version 2 (revision, 03 Apr 24)
5. Larissa Pruner Marques, Oswaldo Cruz Foundation, Rio de Janeiro, Rio de Janeiro, Brazil: reported on Version 2 (revision, 03 Apr 24)


Reviewer Report

06 Nov 2024 | for Version 2

Larissa Pruner Marques, Oswaldo Cruz Foundation, Rio de Janeiro, Rio de Janeiro, Brazil

Approved With Reservations

This study investigates the application of machine learning (ML) techniques to classify individuals into successful aging (SA) and non-successful aging categories, emphasizing the importance of a positive perspective on aging. Using geriatric data from a hospital in Nigeria, the researchers employed three ML methods, namely artificial neural networks (ANN), support vector machines, and Naive Bayes, to analyze a sample of 2,000 individuals.

Here are a few topics for improvement:

Theoretical Definition of Successful Aging: The authors define successful aging (SA) in the abstract as being free from diseases, relying on a theoretical framework that may overly constrain the concept. This perspective does not adequately reflect the adaptability and multifactorial nature of aging, which is more comprehensively discussed in the introduction. By presenting such a limiting definition upfront, the article may mislead readers regarding the broader understanding of successful aging.
Potential Ageist Terminology: The use of terms like "geriatric data" and "elderly" throughout the article could inadvertently promote ageism. It is important for the authors to revisit and reconsider these terms, opting for language that is more respectful and inclusive of older adults, thereby fostering a more positive discourse around aging and medical terms.
Citing References: In the sixth paragraph of the introduction, the authors reference previous works using the term "Ref. 26" instead of the authors' surnames. This could disrupt the flow of reading and diminish the credibility of the sources. The authors should ensure proper citation by using the authors' names for clarity and engagement.
Introduction Structure: The final paragraph of the introduction begins with the study's objective but then shifts to discussing methodology. While providing justification for the research is important, this paragraph would benefit from a more cohesive structure that concludes with the research objective rather than detailing the methodology. This adjustment would enhance clarity and maintain focus.
Clarity on Variables for Successful Aging: A significant limitation of the article is the lack of a clear framework for defining and measuring the variables associated with successful aging. The authors should specify how they assess nutritional status, daily living activities, and quality of life, including the cutoff points for classification. This clarity is essential for understanding the model's reproducibility, as emphasized by the authors in their conclusion.
Table Placement: The inclusion of a table in the discussion section is unconventional and may confuse readers. The authors might consider repositioning the table as supplementary material to maintain the integrity of the discussion.
Data Availability: I commend the authors for their commitment to transparency and openness by making the data available on Zenodo. This is a commendable practice that supports the reproducibility and integrity of research findings.

By addressing these critiques, the authors can enhance the clarity, inclusivity, and overall quality of their article.

Is the work clearly and accurately presented and does it cite the current literature?

Partly
Is the study design appropriate and is the work technically sound?

Yes
Are sufficient details of methods and analysis provided to allow replication by others?

Partly
If applicable, is the statistical analysis and its interpretation appropriate?

I cannot comment. A qualified statistician is required.
Are all the source data underlying the results available to ensure full reproducibility?

Partly
Are the conclusions drawn adequately supported by the results?

Yes

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

public health, multimorbidity, quality of life, aging

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.


Reviewer Report

18 Oct 2024 | for Version 2

Tagne Poupi Theodore Armand, Inje University, Gimhae, South Korea

Approved With Reservations

The research paper presented by Zaccheus et al. uses three state-of-the-art machine learning models to classify successful aging (SA). The authors used data from a sample of 2,000 individuals at Afe Babalola University Multi-System Hospital. The reported results are outstanding, with 100% accuracy, 100% sensitivity, and 100% precision, but they seem unrealistic. Though the method employed is not new, the case study in SA can be significant for science, with positive outcomes for patients considering the size of the aging population worldwide.
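To make concrete what a report of 100% on all three metrics implies, the sketch below computes them from confusion-matrix counts using the standard definitions. The counts are hypothetical and chosen only for illustration; they are not taken from the manuscript.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity (recall), and precision from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }


# Hypothetical counts for illustration only.
perfect = classification_metrics(tp=300, fp=0, fn=0, tn=300)
one_miss = classification_metrics(tp=299, fp=0, fn=1, tn=300)

print(perfect)
print(one_miss)
```

All three metrics equal 1.0 only when there are no false positives and no false negatives; a single misclassified case in the test set is enough to lower at least two of them.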

Here are some points that can improve the manuscript.

1- In the data collection process, the authors mention that records with missing values were not excluded and that the mean imputation method was used. Handling missing values in medical data requires more explanation, and sometimes expert intervention, so this part must be revised. It remains unclear whether the imputed mean values biased the dataset.

2- It is unclear why the authors used two data split configurations, 70/30 and 50/50, for training. They further indicate that 20% was used for validation, while the experimental details mention 30% for testing. The split of the data into training, testing, and validation sets remains confusing.

3- The feature selection process is not sufficiently described. The authors mention consulting gerontologists but say only that they selected some of the most relevant features. Did you apply any additional techniques to determine feature importance? If yes, describe them.

4- Though an external validation set may be difficult to obtain, it can further confirm the results.

5- There is no clear justification for the choice of machine learning algorithms.

6- Is there any clear conclusion about using two proportions of datasets?
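Points 1 and 2 could be addressed by a description along the lines of the following sketch. It is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the 70/15/15 proportions are chosen only for the example, and the single feature stands in for any numeric variable. The imputation mean is estimated from the training rows only and then reused for the validation and test rows, so that held-out data do not influence preprocessing.

```python
import random
from statistics import mean

random.seed(42)

# Synthetic single-feature dataset with missing values (None); not the study's data.
values = [random.gauss(50, 10) if random.random() > 0.1 else None for _ in range(1000)]

# Shuffle row indices once, then cut into 70% train, 15% validation, 15% test.
indices = list(range(len(values)))
random.shuffle(indices)
n_train = len(indices) * 70 // 100
n_val = len(indices) * 15 // 100
train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]

# Estimate the imputation mean from the observed training values only.
train_mean = mean(values[i] for i in train_idx if values[i] is not None)


def impute(idx):
    """Replace missing values in the selected rows with the training mean."""
    return [values[i] if values[i] is not None else train_mean for i in idx]


train, val, test = impute(train_idx), impute(val_idx), impute(test_idx)
print(len(train), len(val), len(test))
```

Stating the split sizes and the point at which imputation is fitted in this explicit way would remove the ambiguity between the training, validation, and testing proportions.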

Is the work clearly and accurately presented and does it cite the current literature?

Partly
Is the study design appropriate and is the work technically sound?

Partly
Are sufficient details of methods and analysis provided to allow replication by others?

Yes
If applicable, is the statistical analysis and its interpretation appropriate?

No
Are all the source data underlying the results available to ensure full reproducibility?

Partly
Are the conclusions drawn adequately supported by the results?

Yes

Competing Interests

No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.


Reviewer Report

06 Mar 2024 | for Version 1

Peter Fedichev, Gero PTE. LTD, Singapore, Singapore

Approved With Reservations

The manuscript outlines a study conducted on a small cohort from Afe Babalola University Multi-System Hospital’s electronic records in Nigeria (January 2019 - April 2023), detailing a Machine Learning (ML) pipeline to assess SA with novel insights on feature selection and accuracy metrics.

Although technically proficient, the small sample size poses limitations on model complexity and validation robustness. To enhance validation, I recommend cross-validation with external cohorts (e.g., NHANES, UK Biobank), ensuring the model's features are universally applicable.

Additionally, the manuscript employs a linear classifier for SA, based on continuous predictors (log odds ratio). It's crucial to evaluate how this correlates with established aging measures like the frailty index (FI) or biological age (BA). A comparison with log-linear classifiers using FI or BA, if data permits, could enrich the manuscript.

I understand that integrating these comparisons might be demanding. I leave it to the authors to decide whether to add new results or to defer them to future work.

However, I believe that a fair discussion of the sample size, model limitations and the relationship between the SA and FI and BA, all supported by recent literature, is essential. Notable references include Kenneth Rockwood's work on FI (Ref [1]) and recent reviews on BA by the biomarkers of aging consortium (Ref [2]).

Is the work clearly and accurately presented and does it cite the current literature?

Partly
Is the study design appropriate and is the work technically sound?

Yes
Are sufficient details of methods and analysis provided to allow replication by others?

Yes
If applicable, is the statistical analysis and its interpretation appropriate?

Yes
Are all the source data underlying the results available to ensure full reproducibility?

Yes
Are the conclusions drawn adequately supported by the results?

Yes

References

1. Blodgett J, Theou O, Kirkland S, Andreou P, et al.: Frailty in NHANES: Comparing the frailty index and phenotype. Arch Gerontol Geriatr. 2015; 60 (3): 464-70.
2. Moqri M, Herzog C, Poganik JR, Ying K, et al.: Validation of biomarkers of aging. Nat Med. 2024; 30 (2): 360-372.

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

theories of aging, AI/ML in biology and drug discovery, systems biology, biomarkers of aging, drug discovery against aging and age-related diseases

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.




[1] Lin Y-H, Chen Y-C, Tseng Y-C, et al.: Physical activity and successful aging among middle-aged and older adults: a systematic review and meta-analysis of cohort studies. Aging (Albany NY). 2020; 12(9): 7704–7716.

[2] Chandraa CE, Abdullaha S: Forecasting mortality trend of Indonesian old aged population with bayesian method. Int. J. Adv. Sci. Eng. Inf. Technol. 2022; 12(2): 580–588.

[3] Seyda Seydel G, Kucukoglu O, Altinbasv A, et al.: Economic growth leads to increase of obesity and associated hepatocellular carcinoma in developing countries. Ann. Hepatol. 2016; 15(5): 662–672.

[4] Mbam KC, Halvorsen CJ, Okoye UO: Aging in Nigeria: A Growing Population of Older Adults Requires the Implementation of National Aging Policies. Gerontologist. 2022 Oct 19; 62(9): 1243–1250.

[5] Wang Q, Li L: The effects of population aging, life expectancy, unemployment rate, population density, per capita GDP, urbanization on per capita carbon emissions. Sustain Product Consum. 2021; 28: 760–774.

[6] Kiziltan M: The Effects of Population Aging and Life Expectancy on Economic Growth: The Case of Emerging Market Economies. Bayar Y, editor. Handbook of Research on Economic and Social Impacts of Population Aging. IGI Global; 2021; pp. 97–118. ch007.

[7] Seong MH, Shin E, Sok S: Successful aging perception in middle-aged korean men: a q methodology approach. Int. J. Environ. Res. Public Health. 2021; 18(6): 3095.

[8] Lin L, Wang HH, Lu C, et al.: Adverse childhood experiences and subsequent chronic diseases among middle-aged or older adults in China and associations with demographic and socioeconomic characteristics. JAMA Netw. Open. 2021; 4(10): e2130143–e2130143.

[9] Ferrucci L, Gonzalez-Freire M, Fabbri E, et al.: Measuring biological aging in humans: a quest. Aging Cell. 2020; 19(2): e13080.

[10] Lin E, Lin C-H, Lane H-Y: Prediction of functional outcomes of schizophrenia with genetic biomarkers using a bagging ensemble machine learning method with feature selection. Sci. Rep. 2021; 11(1): 1–8.

[11] Nosraty L, Pulkki J, Raitanen J, et al.: Successful aging as a predictor of long-term care among oldest old: the vitality 90+ study. J. Appl. Gerontol. 2019; 38(4): 553–571.

[12] Mendoza-Nunez VM, Pulido-Castillo G, Correa-Munoz E, et al.: Effect of a community gerontology program on the control of metabolic syndrome in mexican older adults. Healthcare. 2022; 10(3): 466.

[13] Teater B, Chonody JM: What attributes of successful aging are important to older adults? The development of a multidimensional definition of successful aging. Soc. Work Health Care. 2020; 59(3): 161–179.

[14] Bowling A: Aspirations for older age in the 21st century: What is successful aging? Int. J. Aging Hum. Dev. 2007; 64(3): 263–297.

[15] Bosnes I, Nordahl HM, Stordal E, et al.: Lifestyle predictors of successful aging: a 20-year prospective HUNT study. PLoS One. 2019; 14(7): e0219200.

[16] Rowe JW, Kahn RL: Successful aging. Gerontologist. 1997; 37(4): 433–440.

[17] Shafiee M, Hazrati M, Motalebi SA, et al.: Can healthy life style predict successful aging among Iranian older adults? Med. J. Islam Repub. Iran. 2020; 34: 139.

[18] Chiao CY, Hsiao CY: Comparison of personality traits and successful aging in older Taiwanese. Geriatr. Gerontol. Int. 2017; 17(11): 2239–2246.

[19] Dorji L, Jullamate P, Subgranon R, et al.: Predicting factors of successful aging among community dwelling older adults in Thimphu, Bhutan. Bangkok Med. J. 2019; 15(1): 38–43.

[20] Ng TP, Broekman BF, Niti M, et al.: Determinants of successful aging using a multidimensional definition among Chinese elderly in Singapore. Am. J. Geriatr. Psychiatry. 2009; 17(5): 407–416.

[21] Anton SD, Woods AJ, Ashizawa T, et al.: Successful aging: advancing the science of physical independence in older adults. Ageing Res. Rev. 2015; 24: 304–327.

[22] Liu H, Byles JE, Xu X, et al.: Evaluation of successful aging among older people in China: results from China health and retirement longitudinal study. Geriatr. Gerontol. Int. 2017; 17(8): 1183–1190.

[23] Canêdo AC, Lopes CS, Lourenço RA: Prevalence of and factors associated with successful aging in Brazilian older adults: frailty in Brazilian older people study (FIBRA RJ). Geriatr. Gerontol. Int. 2018; 18(8): 1280–1285.

[24] Cai T, Long J, Kuang J, et al.: Applying machine learning methods to develop a successful aging maintenance prediction model based on physical fitness tests. Geriatr. Gerontol. Int. 2020; 20(6): 637–642.

[25] Raza K: Improving the prediction accuracy of heart disease with ensemble learning and majority voting rule. U-Healthcare Monitoring Systems. Elsevier; 2019; pp. 179–196.

[26] Asghari Varzaneh Z, Shanbehzadeh M, Kazemi-Arpanahi H: Prediction of successful aging using ensemble machine learning algorithms. BMC Med. Inform. Decis. Mak. 2022; 22: 258.

[27] Cai T, Long J, Kuang J, et al.: Applying machine learning methods to develop a successful aging maintenance prediction model based on physical fitness tests. Geriatr. Gerontol. Int. 2020; 20(6): 637–642.

[28] Maryam A, Raoof N, Somayeh N: Designing a Predictive Model for Successful Aging among the Elderly Using Machine Learning Techniques. 2022.

[29] Nagarajan NR, Teixeira AA, Silva ST: Ageing population: identifying the determinants of ageing in the least developed countries. Popul. Res. Policy Rev. 2021; 40(2): 187–210.

[30] Olson DL: Data set balancing. In: Chinese Academy of Sciences Symposium on Data Mining and Knowledge Management. Berlin, Heidelberg: Springer; 2004 Jul 12; 71–80.

[31] Chandrashekar G, Sahin F: A survey on feature selection methods. Comput. Electr. Eng. 2014; 40(1): 16–28.

[32] Li J, Cheng K, Wang S, et al.: Feature selection: a data perspective. ACM Comput. Surv. 2017; 50(6): 1–45.

[33] Guan Z-J, Li R, Jiang J-T, et al.: Data mining and design of electromagnetic properties of Co/FeSi filled coatings based on genetic algorithms optimized artificial neural networks (GA-ANN). Compos. B Eng. 2021; 226: 109383.

[34] Sinaga LM, Suwilo S: Analysis of classification and Naïve Bayes algorithm k-nearest neighbor in data mining. IOP Conference Series: Materials Science and Engineering. IOP Publishing; 2020; Vol. 725(1): p. 012106.

[35] Sembiring M, Tambunan R: Analysis of graduation prediction on time based on student academic performance using the Naïve Bayes Algorithm with data mining implementation (Case study: Department of Industrial Engineering USU). IOP Conference Series: Materials Science and Engineering: 2021. IOP Publishing; 2021; p. 012069.

[36] Gopinath C, Manikanta J: Performance Analysis Based on Data Mining Technique in Predicting the Diabetic Disease-Decision tree and Naïve Bayes. 2019 1st International Conference on Advances in Information Technology (ICAIT): 2019. IEEE; 2019; pp. 525–528.

[37] Prasetya R, Ridwan A: Data mining application on weather prediction using classification tree, naïve bayes and K-nearest neighbor algorithm with model testing of supervised learning probabilistic brier score, confusion matrix and ROC. J. Appl. Commun. Inf. Technol. 2020; 4(2): 25–33.

[38] Khazaei S, Najafi-Ghobadi S, Ramezani-Doroh V: Construction data mining methods in the prediction of death in hemodialysis patients using support vector machine, neural network, logistic regression and decision tree. J. Prev. Med. Hyg. 2021; 62(1): E222–E230.

[39] Chidambaram S, Srinivasagan K: Performance evaluation of support vector machine classification approaches in data mining. Clust. Comput. 2019; 22(1): 189–196.

[40] Srivastava N, Hinton G, Krizhevsky A, et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014; 15(1): 1929–1958.

[41] Byeon H: Exploring factors for predicting anxiety disorders of the elderly living alone in South Korea using interpretable machine learning: a population-based study. Int. J. Environ. Res. Public Health. 2021; 18(14): 7625.

[42] Zaccheus J: Successful Aging Dataset for Elderly Patients. 2023.

