Method Article

Health status classification model for medical adherence system in retirement township

[version 1; peer review: 1 not approved]
PUBLISHED 20 Oct 2021

This article is included in the Research Synergy Foundation gateway.

This article is included in the Artificial Intelligence and Machine Learning gateway.

Abstract

Medical adherence and remote patient monitoring have gained considerable attention from researchers recently, especially with the need to observe patients’ health outside hospitals due to the ongoing pandemic. The main goal of this research work is to propose a health status classification model that provides a numerical indicator of the overall health condition of a patient via four major vital signs: body temperature, blood pressure, blood oxygen saturation level, and heart rate. A dataset has been prepared based on data obtained from hospital records, with these four vital signs extracted for each patient. This dataset provides a label associating each patient with their number of medical diagnoses. Generally, the number of diagnoses correlates with the patient's medical condition, with no diagnoses indicating a normal condition, one or two diagnoses suggesting low risk, and more than two implying high risk. Thus, we propose a method to classify a patient’s health status into three classes: normal, low risk and high risk. This would provide guidance for healthcare workers on the patient's medical condition. By training the classification model on the prepared dataset, the seriousness of a patient's health condition can be predicted. This prediction is performed by classifying patients based on their four vital signs. Our tests have yielded encouraging results using precision and recall as the evaluation metrics. The key outcome of this work is a trained classification model that quantifies a patient's health condition based on four vital signs. Nevertheless, the model can be further improved by considering more input features such as medical history. The results obtained from this research can assist medical personnel by providing secondary advice on the health status of patients located remotely from medical facilities.

Keywords

Medical adherence, home hospitalization, machine learning

Introduction

Background

Medical adherence is described as a patient's willingness to follow their doctor's treatment plan by taking the medications recommended to them.1 Failure to follow a treatment plan might result in negative clinical outcomes and a significant rise in hospitalisation costs.2 According to a recent study, most patients do not follow their doctors' advice, resulting in a considerable rise in hospitalisations and medical visits.3

Many researchers and healthcare firms have been focusing on home hospitalisation technology in recent years.4 Many approaches are used to identify medical adherence: some employ blood testing, some rely on patient self-assessment feedback, and others use pill dispensers to determine whether the medicine was taken. These solutions can give important information on a patient's behaviour, but each has limitations: some are costly to operate, while others provide no information on the type of illness.

By proposing a method for continuous monitoring of the patient's vital signs, this study intends to develop a solution that can aid doctors in monitoring the patient's health state and medication intake. We used machine learning to determine the patient's health status and match it against the medication consumption schedule defined by doctors. The major vital signs are measured using off-the-shelf, medically approved devices. The data is logged and analysed, and the result is presented to doctors as a suggestion, which they may use to alter the medication dose or frequency, or to schedule a doctor visit for the patient.

Literature review

A lot of research has focused on medical adherence and home hospitalisation.5–12 Patients' in-hospital days were reduced when they were given a home hospitalisation option.5 The ability to remotely monitor the patient's health state and to check their adherence to prescriptions is a critical factor in enabling home hospitalisation. Tripathi et al.6 proposed gathering information about the patient via tracking sensors and wearable devices, then transmitting the data to an Internet server, where the decision is taken whether to contact family members, an ambulance, or clinical aid. To enable a smart health environment, Zulkifli et al.7 created a health monitoring and information system. Patients may use their cell phones to submit feedback to physicians, and doctors reply to their reports to determine whether the patient needs an appointment or can continue therapy from home.

Due to the benefits of reducing hospital stays, cutting treatment costs, and freeing up doctors' time, home hospitalisation has garnered a lot of attention. Early release care, which permits patients to stay at home for a portion of their inpatient therapy, was studied by Hernández et al.8 They urged nurses to visit these patients on a regular basis and record their vital signs in an online system that doctors could access. Federman et al.9 assessed hospitalised patients and sent some of them home to undergo treatment. Nurses and health experts evaluated patients' vital signs before asking all patients to rate their treatment experience, with the results indicating that those who received therapy at home obtained a higher rating. Sherif et al.10 suggested a system to facilitate home hospitalisation, by tracking patients' medicine consumption using integrated electronics. Patients were instructed to record their medication adherence using an alarm button that was linked to a monitoring dashboard. While this strategy is useful for the monitoring of patients, it cannot assure that the medicines have been taken.

Similarly, Daramola et al.11 used a smartphone application to report medicine consumption and relied on patient self-reporting. Kumar et al.12 presented a similar method for measuring medication adherence using medicine dispensers. However, these approaches do not guarantee that the medicines have been consumed. To overcome these limitations, this paper intends to verify medicine intake by monitoring patients' vital signs and health status regularly.

Motivation

The study's main goal is to improve patients' medication adherence without the need for special technology or monitoring by a nurse or caregiver. Thousands of people have been hospitalised because of the current COVID-19 pandemic. Many healthcare institutions have run out of resources to handle the rising number of patients.13 Thus, most people with minor symptoms have been recommended to stay at home and monitor their health.13 This pandemic has highlighted the critical necessity for home hospitalisation. Consequently, we propose a complementary tool that assists doctors in monitoring patients’ health from home.

Ethical considerations

This study received ethical approval from the Ethical Review Board of Multimedia University, Technology Transfer Office, Malaysia (EA0962021).

As healthy patients’ data was collected retrospectively and anonymised, written informed consent was waived.

Methods

The method proposed in this paper consists of two parts. The first is data preparation and pre-processing, while the second is the construction of a classification model that identifies the health status category based on the vital signs.

Data preparation

Blood pressure, body temperature, heart rate and blood oxygen saturation level can be considered good indicators for predicting the health status of a patient. The initial stage in data preparation was to assemble a labelled dataset, collected by physicians, from which the patient's health status could be predicted. We adopted a publicly available medical dataset, the Medical Information Mart for Intensive Care III (MIMIC-III) Clinical Database, which contains health data from over 40,000 patients hospitalised in intensive care units.14 The dataset consists of numerous tables with patients’ information, such as vital signs and doctor diagnoses. The caregivers documented the collected data on an hourly basis. However, the data from a single patient is saved in different tables across the dataset. As a result, the data had to be restructured before they could be used for model training. Patients who had more than four daily measurements, covering blood oxygen saturation, blood pressure, body temperature, and heart rate, as well as a doctor's diagnosis, were selected. These readings were divided into two classes: the first class included those with two or fewer diagnoses, while the second class comprised those with more than two diagnoses. To predict the health status of patients, we also needed to train the machine learning model with data from healthy people who had not been diagnosed with any diseases. Since vital sign data from healthy people were not widely available, we decided to collect these data with the help of a general physician. The resulting dataset consists of three classes: the first class corresponds to healthy people; the second class includes patients with one or two diagnoses; the third class includes patients with more than two diagnoses.
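A minimal sketch of this preparation step is given below. It assumes the vital-sign readings and per-patient diagnosis counts have already been extracted from MIMIC-III into flat tables; all file and column names are illustrative, not taken from the original pipeline.

```python
import pandas as pd

# Illustrative inputs: per-reading vital signs and per-patient diagnosis counts.
vitals = pd.read_csv("vitals.csv")            # patient_id, date, sbp, dbp, hr, spo2, temp
diag = pd.read_csv("diagnosis_counts.csv")    # patient_id, n_diagnoses

# Keep only patients with more than four readings per day.
vitals["n_readings"] = vitals.groupby(["patient_id", "date"])["hr"].transform("size")
vitals = vitals[vitals["n_readings"] > 4]

# Average the readings per patient and attach the diagnosis count.
features = (vitals.groupby("patient_id")[["sbp", "dbp", "hr", "spo2", "temp"]]
            .mean()
            .join(diag.set_index("patient_id")))

# Label the MIMIC-III rows: 1 = low risk (one or two diagnoses), 2 = high risk (more than two).
features["label"] = (features["n_diagnoses"] > 2).astype(int) + 1

# The separately collected healthy readings are labelled 0 and appended.
healthy = pd.read_csv("healthy_vitals.csv")   # same five vital-sign columns
healthy["label"] = 0
dataset = pd.concat([features.drop(columns="n_diagnoses"), healthy], ignore_index=True)
```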

Training the machine learning model

Three multiclass classification models were proposed to help doctors, by providing a secondary opinion based on vital signs, to determine how serious a patient’s health condition is. The prediction was based on the labelled dataset, which consisted of three classes. The classifiers, chosen based on the size of the dataset, were the k-nearest neighbour classifier, the linear support vector machine (SVM) classifier, and the logistic regression classifier. Scikit-learn v0.23, which is well known for its classification methods, was used to train these models.15 The purpose of the classifiers was to forecast the patient's health status class based on their vital signs. The diagnoses in our prepared dataset were related to heart disease, respiratory illness and hypertension. Doctors would subsequently verify or reject the model's predictions in order to provide feedback data that could be used to train a more accurate model.
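The sketch below shows how the three classifiers could be instantiated in scikit-learn, using the hyperparameter values reported in the Results section; any setting not mentioned in the paper is left at its scikit-learn default.

```python
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# The three multiclass classifiers, configured with the hyperparameter values
# reported in the Results section; everything else uses scikit-learn defaults.
models = {
    "linear_svm": LinearSVC(C=1, max_iter=1000, intercept_scaling=1),
    "logistic_regression": LogisticRegression(C=1, max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=5, weights="uniform", metric="euclidean"),
}
```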

Results and discussion

A well-labelled and filtered dataset with five variables and one label was prepared. The mean values for blood oxygen saturation level, diastolic blood pressure, systolic blood pressure, heart rate, and body temperature were used as variables in the prepared dataset, as well as a label that indicated whether the measurements were normal, abnormal or abnormal with a serious condition. The variables included in the prepared dataset are listed below:

  • Diastolic blood pressure

  • Systolic blood pressure

  • Heart rate

  • Blood oxygen saturation level

  • Temperature

  • Label

A total of 1,382 rows were collected after filtering the data for patients who had been diagnosed with diseases related to the vital signs, including only those who had more than three readings per day. A similar approach was used to obtain 102 “normal” readings manually, with the aid of a medical expert. Both tables were then concatenated; the patient ID was removed, and the label "0" was assigned to normal readings, "1" to abnormal readings with up to two diagnoses, and “2” to abnormal readings with more than two diagnoses.

Although we collected as much data for healthy people as possible, the dataset was still imbalanced, with the majority of the data samples coming from the MIMIC-III dataset, resulting in bias.

As shown in Table 1, the dataset consisted of 102 rows of healthy data samples (Class 0), 813 rows of low-risk data samples (Class 1) and 569 rows of high-risk data samples (Class 2). A 10% fraction of the data samples was used for testing. The rest of the dataset was split using repeated stratified K-fold cross-validation, with four splits and ten repetitions. A significant advantage of this validation technique when handling unbalanced data is that each fold preserves the class proportions of the full dataset, keeping training and testing balanced.14 Multiple splits and repetitions allowed the training and evaluation of different iterations of the same model, with training and test outcomes differing depending on the sampled data. After the data were divided into training and test sets, we trained three distinct models to assess their performance. These processing stages are shown in the system architecture of the proposed solution, depicted in Figure 1.

Table 1. Prepared dataset information.

            Class 0   Class 1   Class 2
Training    92        732       512
Testing     10        81        57

Figure 1. System architecture.
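The split described above could be reproduced along the following lines. This is a sketch, not the authors' exact code: the 10% test split is assumed to be stratified (consistent with the class counts in Table 1), and the variable names continue from the preparation sketch in the Methods section.

```python
from sklearn.model_selection import train_test_split, RepeatedStratifiedKFold

# Features and labels from the prepared dataset (illustrative column names).
X = dataset[["dbp", "sbp", "hr", "spo2", "temp"]].values
y = dataset["label"].values

# Hold out 10% of the samples for testing, preserving class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, stratify=y, random_state=42)

# Repeated stratified K-fold on the remaining data: 4 splits, 10 repetitions.
cv = RepeatedStratifiedKFold(n_splits=4, n_repeats=10, random_state=42)
for fold, (train_idx, val_idx) in enumerate(cv.split(X_train, y_train)):
    model = models["linear_svm"]          # or any of the other classifiers
    model.fit(X_train[train_idx], y_train[train_idx])
    score = model.score(X_train[val_idx], y_train[val_idx])
```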

The first trained model was the linear SVC. The hyperparameters used were ‘C’ set to 1, ‘max_iter’ set to 1000 and ‘intercept_scaling’ set to 1. This model exhibited relatively low performance in predicting high-risk patients (Class 2), as shown in the confusion matrix in Table 2. The evaluation results in terms of precision and recall for the first model are presented in Table 3.

Table 2. Confusion matrix of linear support vector machine (SVM) classifier.

                     Predicted class
                     0     1     2
True class   0       10    0     0
             1       0     50    31
             2       0     36    21

Table 3. Precision and recall values of linear support vector machine (SVM) classifier.

          Precision   Recall
Class 0   1.00        1.00
Class 1   0.58        0.62
Class 2   0.40        0.37
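The per-class precision and recall values reported in Tables 2 to 9 can be computed from the held-out test set with scikit-learn's metrics module, as sketched below; the variable names follow the earlier sketches.

```python
from sklearn.metrics import confusion_matrix, classification_report

# Evaluate one of the trained models on the held-out test set.
# Labels 0/1/2 correspond to normal, low risk and high risk.
y_pred = model.predict(X_test)
print(confusion_matrix(y_test, y_pred, labels=[0, 1, 2]))
print(classification_report(y_test, y_pred, labels=[0, 1, 2], digits=2))
```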

The second model trained was based on logistic regression, constructed with ‘max_iter’ set to 1000 and ‘C’ set to 1. As shown in Table 4 and Table 5, the logistic regression model demonstrated modest performance relative to the previous classifier, with better precision and recall for Class 2. Nevertheless, the recall for Class 1 was considerably lower.

Table 4. Confusion matrix of logistic regression classifier.

                     Predicted class
                     0     1     2
True class   0       10    0     0
             1       0     21    60
             2       0     13    44

Table 5. Precision and recall values of the logistic regression classifier.

          Precision   Recall
Class 0   1.00        1.00
Class 1   0.62        0.26
Class 2   0.42        0.77

The third model was the k-nearest neighbour classifier, trained with 'n_neighbors' set to 5, 'weights' set to ‘uniform’ and using Euclidean distance. The confusion matrix generated by applying the k-NN classifier on the test data is shown in Table 6. The results for the trained model are presented in Table 7. The performance of the model is on par with the first model, but worse than the second model for Class 2 in terms of recall.

Table 6. Confusion matrix of k-nearest neighbour classifier.

                     Predicted class
                     0     1     2
True class   0       10    0     0
             1       0     58    23
             2       0     40    17

Table 7. Precision and recall values of k-nearest neighbour classifier.

          Precision   Recall
Class 0   1.00        1.00
Class 1   0.59        0.72
Class 2   0.43        0.30

After training the three classification models, we trained a majority-vote classifier, which aggregated the predictions of the three trained models and predicted the class that obtained the greatest number of votes. The results are presented in Table 8 and Table 9.

Table 8. Confusion matrix of the majority-vote classifier.

                     Predicted class
                     0     1     2
True class   0       10    0     0
             1       0     45    36
             2       0     29    28

Table 9. Precision and recall values of majority-vote classifier.

          Precision   Recall
Class 0   1.00        1.00
Class 1   0.61        0.56
Class 2   0.44        0.49
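One way to reproduce the majority vote, assuming the three base models defined earlier, is scikit-learn's VotingClassifier with hard voting. This is a sketch rather than the authors' exact aggregation code.

```python
from sklearn.ensemble import VotingClassifier

# Hard voting predicts the class chosen by the majority of the three base models.
voting = VotingClassifier(
    estimators=[(name, clf) for name, clf in models.items()],
    voting="hard")
voting.fit(X_train, y_train)
y_pred = voting.predict(X_test)
```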

In this study, the majority-vote classifier was chosen for its overall performance in terms of precision and recall, in comparison with the three individually trained models. Further improvements will be made in the future regarding imbalanced-data training, or by adding more minority-class data to the dataset. The positive results suggest that the proposed classification model is feasible for predicting the severity of a patient's health condition, especially for distinguishing healthy individuals from those at medical risk.

Conclusions

The goal of this study was to improve medical adherence among patients who receive therapy at home, by tracking their health status using supervised machine learning models. We are confident that the proposed solution will increase medication adherence and provide a way to enable home hospitalisation, which is now in great demand. The trained model can predict how serious a patient's health status is, as determined by the four measured vital signs. Clinicians should be able to accept or reject the model's predictions, providing feedback which can then be utilised to train and enhance another version of the model. To further improve the performance of the model, additional input variables such as medical history can be considered.

Data availability

Underlying data

The experiments in this work were carried out using data from the MIMIC-III Clinical Database: https://doi.org/10.13026/C2XW26.14

Extended data

Zenodo: Five vital signs of normal people, https://doi.org/10.5281/zenodo.5549632.16

This project contains the anonymised vital signs data from healthy patients collected retrospectively.

Analysis code available from: https://github.com/abubakerSherif/A-Health-Status-Classification-Model/tree/v1.0

Archived analysis code as at time of publication: https://doi.org/10.5281/zenodo.5551854

License: MIT


Open Peer Review

Reviewer Report 19 Feb 2024
Jesper Mølgaard, Copenhagen University Hospital, Copenhagen, Denmark
Not Approved

The authors of this article report on the results of using three different statistical methods for discriminating between 3 patient categories. They propose this model for helping healthcare workers understand which patients need medical attention and which do not. ...
How to cite this report:
Mølgaard J. Reviewer Report For: Health status classification model for medical adherence system in retirement township [version 1; peer review: 1 not approved]. F1000Research 2021, 10:1065 (https://doi.org/10.5256/f1000research.76976.r234456)
