Keywords
Medical adherence, home hospitalization, machine learning
This article is included in the Research Synergy Foundation gateway.
This article is included in the Artificial Intelligence and Machine Learning gateway.
Medical adherence is described as a patient's willingness to follow their doctor's treatment plan by taking the medications recommended to them.1 Failure to follow a treatment plan might result in negative clinical outcomes and a significant rise in hospitalisation cost.2 According to a recent study, most patients do not follow their doctors' advice, resulting in a considerable rise in hospitalizations and medical visits.3
Many researchers and healthcare firms have been focusing on home hospitalization technology in recent years.4 Many approaches are used to identify medical adherence: some employ blood testing, some employ patient self-assessment feedback, and others employ pill dispensers to determine whether the medicine was taken. These solutions can give important information on a patient's behaviour, but each has limitations: some are costly to operate, while others do not provide information on the type of illness.
This study intends to develop a solution that helps doctors monitor a patient's health state and medication intake through continuous monitoring of the patient's vital signs. We used machine learning to determine the patient's health status and match it to the medication schedule defined by doctors. The main vital signs are measured with off-the-shelf, medically approved devices. The data are logged and analysed, and the result is presented to doctors in the form of a suggestion, which they may use to alter the medicine dose or frequency, or to schedule a doctor visit for the patient.
A lot of research has been focused on medical adherence and home hospitalisation.5–12 Patients' in-hospital days were reduced when they were given a home hospitalisation option.5 The ability to remotely monitor the patient's health state and check their adherence to prescriptions is critical for enabling home hospitalisation. Tripathi et al.6 proposed to gather information about the patient via tracking sensors and wearable devices, and then transmit the data to an Internet server, where the decision is taken whether to contact family members, an ambulance, or clinical aid. To enable a smart health environment, Zulkifli et al.7 created a health monitoring and information system. Patients may use their cell phones to submit feedback to physicians, and doctors reply to their reports to determine whether the patient needs an appointment or can continue therapy from home.
Due to the benefits of reducing hospital stays, cutting treatment costs, and freeing up doctors' time, home hospitalisation has garnered a lot of attention. Early release care, which permits patients to stay at home for a portion of their inpatient therapy, was studied by Hernández et al.8 They urged nurses to visit these patients on a regular basis and record their vital signs in an online system that doctors could access. Federman et al.9 assessed hospitalised patients and sent some of them home to undergo treatment. Nurses and health experts evaluated patients' vital signs before asking all patients to rate their treatment experience, with the results indicating that those who received therapy at home obtained a higher rating. Sherif et al.10 suggested a system to facilitate home hospitalisation, by tracking patients' medicine consumption using integrated electronics. Patients were instructed to record their medication adherence using an alarm button that was linked to a monitoring dashboard. While this strategy is useful for the monitoring of patients, it cannot assure that the medicines have been taken.
Similarly, Daramola et al.11 used a smartphone application to report medicine consumption and relied on patient self-reporting. Kumar et al.12 presented a similar method for measuring medication adherence using medicine dispensers. However, these approaches do not guarantee that the medicines have been consumed. To overcome these limitations, this paper intends to verify medicine intake by monitoring patients' vital signs and health status regularly.
The study's main goal is to improve patients' medication adherence without the need for special technology or for monitoring by nurses or caregivers. Thousands of people have been hospitalised because of the current COVID-19 pandemic. Many healthcare institutions have run out of resources to handle the rising number of patients.13 Thus, most people with minor symptoms have been recommended to stay at home and monitor their health.13 This pandemic has highlighted the critical necessity for home hospitalisation. Consequently, we propose a complementary tool that assists doctors in monitoring patients’ health from home.
The method proposed in this paper consists of two sections. The first section is data preparation and pre-processing, while the second section is creating a classification model to identify the health status category based on the vital signs.
Blood pressure, body temperature, heart rate and blood oxygen saturation level can be considered as good indicators to predict the health status of the patient. The initial stage in data preparation was to assemble a labelled dataset collected by physicians to anticipate the patient's health status. We adopted a publicly available medical dataset called Medical Information Mart for Intensive Care III (MIMIC-III) Clinical Database, which contains health data from over 40,000 patients hospitalised in intensive care units.14 The dataset consists of numerous tables with patients’ information, such as vital signs and doctor diagnoses. The caregivers documented the data collected on an hourly basis. However, the data from a single patient is saved in different tables across the dataset. As a result, the data must be restructured before they can be used for model training. Patients who had more than four daily measurements, including blood oxygen saturation, blood pressure, body temperature, and heart rate, as well as a doctor's diagnosis, were selected. These readings were divided into two classes: the first class included those with two or fewer diagnoses while the second class comprised those with more than two diagnoses. To predict the health status of the patients, we needed to train the machine learning model with data from healthy people, who were not diagnosed with any diseases. Since vital sign data from healthy people were not widely available, we decided to collect these data with the help of a general physician. The resulting dataset consists of three classes: the first class corresponding to healthy people; the second class includes patients with up to two diagnoses; the third class includes patients with more than two diagnoses.
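The selection and labelling rules described above can be sketched as follows. This is only an illustration under stated assumptions: the `PatientDay` record, its field names and the helper functions are hypothetical and do not reflect the actual MIMIC-III table layout or the authors' code.

```python
from dataclasses import dataclass


# Hypothetical record layout; the real MIMIC-III tables are structured differently.
@dataclass
class PatientDay:
    patient_id: int
    n_readings: int    # complete vital-sign readings recorded that day
    n_diagnoses: int   # diagnoses attached to the patient


def is_eligible(day: PatientDay, min_readings: int = 5) -> bool:
    """Keep days with more than four readings and at least one diagnosis."""
    return day.n_readings >= min_readings and day.n_diagnoses >= 1


def health_class(n_diagnoses: int) -> int:
    """Map a diagnosis count to the three classes described in the text."""
    if n_diagnoses == 0:
        return 0   # healthy (collected separately from MIMIC-III)
    if n_diagnoses <= 2:
        return 1   # up to two diagnoses
    return 2       # more than two diagnoses


days = [
    PatientDay(1, 6, 1),
    PatientDay(2, 3, 4),   # too few readings, dropped
    PatientDay(3, 8, 3),
]
labelled = [(d.patient_id, health_class(d.n_diagnoses)) for d in days if is_eligible(d)]
print(labelled)  # [(1, 1), (3, 2)]
```

The threshold is kept as a parameter so that a different minimum number of daily readings can be tried without changing the labelling rule.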
Three multiclass classification models were proposed to help doctors by providing a secondary opinion, based on vital signs, on how serious a patient’s health condition is. The prediction was based on the labelled dataset, which consisted of three classes. The classifiers, chosen based on the size of the dataset, were a k-nearest neighbour classifier, a linear support vector machine (SVM) classifier, and a logistic regression classifier. These models were trained with Scikit-learn v. 0.23, which is well known for its classification methods.15 The purpose of the classifiers was to predict the patient's health status class from their vital signs. Diagnoses used in our prepared dataset were related to heart disease, respiratory sickness and hypertension. Doctors would subsequently verify or reject the predicted class in order to offer feedback data that could be used to train a more accurate model.
A well-labelled and filtered dataset with five variables and one label was prepared. The mean values for blood oxygen saturation level, diastolic blood pressure, systolic blood pressure, heart rate, and body temperature were used as variables in the prepared dataset, as well as a label that indicated whether the measurements were normal, abnormal or abnormal with a serious condition. The variables included in the prepared dataset are listed below:
• Diastolic blood pressure
• Systolic blood pressure
• Heart rate
• Blood oxygen saturation level
• Temperature
• Label
A total of 1,382 rows were collected after filtering data for those patients who had been diagnosed with diseases related to the vital signs, including only those who had more than three readings per day. A similar approach was utilised to get 102 "normal" readings manually with the aid of a medical expert. Both tables were then concatenated and the patient ID was removed. The label "0" was assigned to normal readings, "1" to abnormal readings with up to two diagnoses, and "2" to abnormal readings with more than two diagnoses.
Although we had collected as much data from healthy people as possible, the dataset was still imbalanced: the majority of the data samples came from the MIMIC-III dataset, which may bias the trained models towards the abnormal classes.
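As a minimal sketch of how one row of the prepared dataset could be assembled, the example below averages a set of readings per variable and attaches the class label. The column names, the `to_row` helper and all numeric values are assumptions made for illustration; they are not taken from MIMIC-III or from the Zenodo dataset.

```python
from statistics import mean

# Assumed column names for the five variables listed above.
FEATURES = ["diastolic_bp", "systolic_bp", "heart_rate", "spo2", "temperature"]


def to_row(readings: list[dict], label: int) -> dict:
    """Average the readings per variable and attach the class label."""
    row = {name: mean(r[name] for r in readings) for name in FEATURES}
    row["label"] = label
    return row


# Illustrative values only.
patient_readings = [
    {"diastolic_bp": 95, "systolic_bp": 150, "heart_rate": 100, "spo2": 93, "temperature": 37.5},
    {"diastolic_bp": 85, "systolic_bp": 140, "heart_rate": 90, "spo2": 95, "temperature": 37.5},
]
healthy_readings = [
    {"diastolic_bp": 80, "systolic_bp": 120, "heart_rate": 70, "spo2": 98, "temperature": 36.5},
    {"diastolic_bp": 78, "systolic_bp": 118, "heart_rate": 72, "spo2": 98, "temperature": 36.5},
]

# Concatenate both sources; the patient ID is never copied into the row.
dataset = [to_row(patient_readings, label=1), to_row(healthy_readings, label=0)]
print(dataset[0]["systolic_bp"], dataset[1]["label"])
```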
As shown in Table 1, the dataset consisted of 102 rows of healthy data samples (Class 0), 813 rows of low-risk data samples (Class 1) and 569 rows of high-risk data samples (Class 2). A 10% fraction of the data samples was used for testing. The rest of the dataset was split using repeated stratified K-Fold cross-validation, with four splits and ten repetitions. A significant advantage of this validation technique for imbalanced data is that stratification keeps the percentage of observations in each class approximately the same in every training and validation fold.14 Multiple splits and repetitions allowed the training and evaluation of different iterations of the same model, with training and test outcomes differing depending on the sampled data. After the data were divided into training and test sets, we trained three distinct models to assess their performance. These processing stages are shown in the system architecture of the proposed solution, depicted in Figure 1.
The first trained model was linear SVC. The settings of the hyperparameters used were ‘C’ set to 1, ‘max_iter’ set to 1000 and ‘intercept_scaling’ set to 1. This model exhibited a relatively low performance in predicting patients with more than two diagnoses (Class 2), as shown in the confusion matrix in Table 2. The evaluation results in terms of precision and recall for the first model are presented in Table 3.
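The splitting scheme described above can be sketched with scikit-learn as shown below. The data are synthetic placeholders that only reproduce the reported class sizes; stratifying the hold-out split, the fixed random seeds and the test size of 148 samples (the total of the test confusion matrices) are assumptions rather than details confirmed by the text.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in with the class sizes reported in the text (102 / 813 / 569).
y = np.repeat([0, 1, 2], [102, 813, 569])
X = rng.normal(size=(y.size, 5))  # five placeholder features, not real vital signs

# Hold out roughly 10% for testing; stratifying this split is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=148, stratify=y, random_state=0
)

# Four folds repeated ten times gives 40 train/validation partitions.
cv = RepeatedStratifiedKFold(n_splits=4, n_repeats=10, random_state=0)
fold_shares = []
for train_idx, val_idx in cv.split(X_train, y_train):
    # Share of each class in the validation fold stays close to the overall share.
    fold_shares.append(np.bincount(y_train[val_idx], minlength=3) / val_idx.size)

print(len(fold_shares))
print(np.round(np.mean(fold_shares, axis=0), 3))
```

Because every fold preserves the class proportions, the minority healthy class is represented in each validation fold even though it makes up only about 7% of the samples.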
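Table 2. Confusion matrix of the linear SVC model (rows: true class; columns: predicted class).
True class | Predicted 0 | Predicted 1 | Predicted 2 |
---|---|---|---|
0 | 10 | 0 | 0 |
1 | 0 | 50 | 31 |
2 | 0 | 36 | 21 |

Table 3. Precision and recall of the linear SVC model.
Class | Precision | Recall |
---|---|---|
Class 0 | 1.00 | 1.00 |
Class 1 | 0.58 | 0.62 |
Class 2 | 0.40 | 0.37 |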
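The precision and recall values follow directly from the confusion matrix: precision divides the diagonal entry by its column sum, and recall divides it by its row sum. The short sketch below recomputes them from the linear SVC counts reported above; the function name is illustrative.

```python
# Confusion matrix of the linear SVC model (rows: true class, columns: predicted class).
confusion = [
    [10, 0, 0],
    [0, 50, 31],
    [0, 36, 21],
]


def precision_recall(cm: list[list[int]], k: int) -> tuple[float, float]:
    """Per-class precision and recall from a square confusion matrix."""
    true_pos = cm[k][k]
    predicted_k = sum(row[k] for row in cm)  # column sum: everything predicted as k
    actual_k = sum(cm[k])                    # row sum: everything that truly is k
    return true_pos / predicted_k, true_pos / actual_k


for k in range(3):
    p, r = precision_recall(confusion, k)
    print(f"Class {k}: precision={p:.2f} recall={r:.2f}")
# Class 0: precision=1.00 recall=1.00
# Class 1: precision=0.58 recall=0.62
# Class 2: precision=0.40 recall=0.37
```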
The second model trained was based on logistic regression, constructed with ‘max_iter’ set to 1000 and ‘C’ set to 1. As shown in Table 4 and Table 5, the logistic regression model demonstrated a modest improvement over the previous classifier for Class 2, with better precision and recall. Nevertheless, its recall for Class 1 was considerably lower.
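Table 4. Confusion matrix of the logistic regression model (rows: true class; columns: predicted class).
True class | Predicted 0 | Predicted 1 | Predicted 2 |
---|---|---|---|
0 | 10 | 0 | 0 |
1 | 0 | 21 | 60 |
2 | 0 | 13 | 44 |

Table 5. Precision and recall of the logistic regression model.
Class | Precision | Recall |
---|---|---|
Class 0 | 1.00 | 1.00 |
Class 1 | 0.62 | 0.26 |
Class 2 | 0.42 | 0.77 |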
The third model was the k-nearest neighbour classifier, trained with 'n_neighbors' set to 5, 'weights' set to ‘uniform’ and using Euclidean distance. The confusion matrix generated by applying the k-NN classifier on the test data is shown in Table 6. The results for the trained model are presented in Table 7. The performance of the model is on par with the first model, but worse than the second model for Class 2 in terms of recall.
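Table 6. Confusion matrix of the k-NN model (rows: true class; columns: predicted class).
True class | Predicted 0 | Predicted 1 | Predicted 2 |
---|---|---|---|
0 | 10 | 0 | 0 |
1 | 0 | 58 | 23 |
2 | 0 | 40 | 17 |

Table 7. Precision and recall of the k-NN model.
Class | Precision | Recall |
---|---|---|
Class 0 | 1.00 | 1.00 |
Class 1 | 0.59 | 0.72 |
Class 2 | 0.43 | 0.30 |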
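For reference, the three base classifiers can be instantiated in scikit-learn with the hyperparameters stated in the text, as sketched below. All other settings are left at the library defaults, which is an assumption, and the training data are synthetic placeholders rather than the vital-sign dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC

# Hyperparameters as stated in the text; everything else is left at library defaults.
models = {
    "linear_svc": LinearSVC(C=1, max_iter=1000, intercept_scaling=1),
    "logistic_regression": LogisticRegression(C=1, max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=5, weights="uniform", metric="euclidean"),
}

# Synthetic three-class data standing in for the five vital-sign variables.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 60)
X = rng.normal(size=(y.size, 5)) + y[:, None]  # shift each class so it is learnable

predictions = {}
for name, model in models.items():
    model.fit(X, y)
    predictions[name] = model.predict(X)
    print(name, predictions[name].shape)
```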
After training three different classification models, we trained a majority-vote classifier, which aggregated the predictions of the three trained models and made the predictions based on the class that obtained the greatest number of votes. The results are presented in Table 8 and Table 9.
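Table 8. Confusion matrix of the majority-vote classifier (rows: true class; columns: predicted class).
True class | Predicted 0 | Predicted 1 | Predicted 2 |
---|---|---|---|
0 | 10 | 0 | 0 |
1 | 0 | 45 | 36 |
2 | 0 | 29 | 28 |

Table 9. Precision and recall of the majority-vote classifier.
Class | Precision | Recall |
---|---|---|
Class 0 | 1.00 | 1.00 |
Class 1 | 0.61 | 0.56 |
Class 2 | 0.44 | 0.49 |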
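The voting step itself can be illustrated without any model: each sample receives the class predicted by most of the three classifiers. In the sketch below, the prediction lists are made-up values, and the tie-breaking rule (lowest class label wins when all three models disagree) is an assumption; it mirrors the documented behaviour of scikit-learn's `VotingClassifier` with `voting="hard"`, but the text does not state how the authors resolved ties.

```python
from collections import Counter


def majority_vote(votes: tuple[int, ...]) -> int:
    """Return the most frequent label; ties go to the lowest label (assumed rule)."""
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


# Made-up predictions from three models for four samples.
svc_pred = [0, 1, 2, 1]
lr_pred = [0, 2, 2, 2]
knn_pred = [0, 1, 1, 0]

ensemble_pred = [majority_vote(v) for v in zip(svc_pred, lr_pred, knn_pred)]
print(ensemble_pred)  # [0, 1, 2, 0]
```

The last sample shows the tie case: the three models predict 1, 2 and 0, so the assumed rule falls back to the lowest label.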
In this study, the majority-vote classifier was chosen according to its overall performance in terms of precision and recall, in comparison to the results of the three individually trained models. Further improvements will be made in the future regarding imbalanced data training, or by adding more minority data to the dataset. The positive results suggest that the proposed classification model is feasible for predicting the severity of a health condition, especially for distinguishing healthy people from those with medical risks.
The goal of this study was to improve medical adherence among patients who get therapy at home, by tracking their health status using supervised machine learning models. We are confident that the proposed solution will increase medicine adherence and provide a way to enable home hospitalisation, which is now in great demand. The trained model can predict how serious a patient's health status is, as determined by the four measured vital signs. Clinicians should be able to accept or reject the model's predictions, providing feedback which can then be utilised to train and enhance another version of the model. To further improve the performance of the model, additional input variables such as medical history can be considered.
The experiments in this work were carried out using data from the MIMIC-III Clinical Database (ref. 14): https://doi.org/10.13026/C2XW26
Zenodo: Five vital signs of normal people (ref. 16), https://doi.org/10.5281/zenodo.5549632
This project contains the anonymised vital signs data from healthy patients collected retrospectively.
Analysis code available from: https://github.com/abubakerSherif/A-Health-Status-Classification-Model/tree/v1.0
Archived analysis code as at time of publication: https://doi.org/10.5281/zenodo.5551854
License: MIT
Is the rationale for developing the new method (or application) clearly explained?
No
Is the description of the method technically sound?
No
Are sufficient details provided to allow replication of the method development and its use by others?
Yes
If any results are presented, are all the source data underlying the results available to ensure full reproducibility?
Partly
Are the conclusions about the method and its performance adequately supported by the findings presented in the article?
No
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Continuous vital sign data, postoperative complications,
Invited reviewers: 1. Version 1: 20 Oct 21.