ALL Metrics
-
Views
-
Downloads
Get PDF
Get XML
Cite
Export
Track
Research Article

Optimised deep neural network model to predict asthma exacerbation based on personalised weather triggers

[version 1; peer review: 3 approved]
PUBLISHED 10 Sep 2021
Author details Author details
OPEN PEER REVIEW
REVIEWER STATUS

This article is included in the Research Synergy Foundation gateway.

This article is included in the Health Services gateway.

This article is included in the Artificial Intelligence and Machine Learning gateway.

Abstract

Background – Recently, there have been attempts to develop mHealth applications for asthma self-management. However, there is a lack of applications that can offer accurate predictions of asthma exacerbation using the weather triggers and demographic characteristics to give tailored response to users. This paper proposes an optimised Deep Neural Network Regression (DNNR) model to predict asthma exacerbation based on personalised weather triggers.
Methods – With the aim of integrating weather, demography, and asthma tracking, an mHealth application was developed where users conduct the Asthma Control Test (ACT) to identify the chances of their asthma exacerbation. The asthma dataset consists of panel data from 10 users that includes 1010 ACT scores as the target output. Moreover, the dataset contains 10 input features which include five weather features (temperature, humidity, air-pressure, UV-index, wind-speed) and five demography features (age, gender, outdoor-job, outdoor-activities, location).
Results – Using the DNNR model on the asthma dataset, a score of 0.83 was achieved with Mean Absolute Error (MAE)=1.44 and Mean Squared Error (MSE)=3.62. It was recognised that, for effective asthma self-management, the prediction errors must be in the acceptable loss range (error<0.5). Therefore, an optimisation process was proposed to reduce the error rates and increase the accuracy by applying standardisation and fragmented-grid-search. Consequently, the optimised-DNNR model (with 2 hidden-layers and 50 hidden-nodes) using the Adam optimiser achieved a 94% accuracy with MAE=0.20 and MSE=0.09.
Conclusions – This study is the first of its kind that recognises the potentials of DNNR to identify the correlation patterns among asthma, weather, and demographic variables. The optimised-DNNR model provides predictions with a significantly higher accuracy rate than the existing predictive models and using less computing time. Thus, the optimisation process is useful to build an enhanced model that can be integrated into the asthma self-management for mHealth application.

Keywords

Machine learning, deep neural network, personalisation, asthma self-management

Introduction

Asthma is a chronic lung disease that affects people of all age groups around the world.1 Asthma exacerbation causes asthma attacks, and the frequency of asthma attacks depends on the exposure to asthma triggers.2 Weather is a common triggering factor of asthma exacerbation.3 Studies show that weather triggers, such as temperature, humidity, air pressure, and wind, cause asthma attacks.46 Weather impact is specific to individual asthmatic patients due to their lung performance, which varies among patients. This depends on their demographic characteristics, such as age and gender.7 Geographical location is also a factor because the association between weather triggers and asthma is inconsistent in different climate regions.4

Although asthma cannot be cured, avoiding exposure to weather triggers through asthma self-management can minimise the risk of asthma exacerbation.8 Recently, there have been attempts to develop mHealth applications to assist asthma self-management. However, until now, no application for effective asthma self-management exists that has been widely adopted by users or integrated into primary asthma care records.2 This is because there is a lack of solutions that can offer accurate predictions of asthma exacerbation based on personalised weather triggers and provide tailored feedback to users.

Deep Neural Network (DNN) is a type of neural network algorithm with multiple hidden layers and several nodes.9 In recent years, DNN has been significantly utilised in the health informatics research domain for forecasting and pattern recognition.1013 This is because DNN models tend to learn more effectively and have better performance in providing accurate predictions (especially through optimisation) than traditional Machine Learning (ML) algorithms.14 Nevertheless, the application of ML and DNN in weather-based healthcare is still in its infancy. In fact, to the best of the authors’ knowledge, none of the existing research has applied DNN to predict asthma exacerbation based on demography and weather. Therefore, the main contribution of the work in this paper is to apply DNN and propose an optimisation process to predict asthma exacerbation based on personalised weather triggers with low error and high accuracy. The findings will be helpful for developing mHealth solutions with personalisation for effective asthma self-management.

Methods

Data collection

With the aim of integrating weather, demography, and asthma tracking, an mHealth application, namely Weather Asthma (WEA), was developed for this study.15 The WEA is an android-based application that collects user demography and monitors daily weather forecasts in individual users’ location to identify the potential weather triggers. Consequently, both demography and weather data are selected as input features in the asthma dataset.

The WEA application also allows users to conduct the Asthma Control Test (ACT).16 The ACT is a self-administered survey which is considered the standard assessment for monitoring chronic asthma and recommended by the Global Initiative for Asthma.17 The ACT score is selected as the target output for prediction because it helps identify the severity and chances of asthma exacerbation.16

Data was collected through the WEA application from ten participants with asthma over a period of one-year. Participants conducted ACTs by regularly answering five multiple-choice questions, which include four asthma symptom-related questions and one asthma self-evaluate question. Each question is scored between 1-5. Once the ACT was submitted, a timestamp was formed with the participant’s demography and the weather information of that day and time at their location. This timestamp, along with the total ACT score, are stored in the database, as seen in Table 1. All participants consented to the data collection and the ethical approval was obtained from the Multimedia University Research Ethics Committee (EA1532021).

Table 1. Data collection example.

Data nameExample
User ID5QBEe959GPOxv3rDNYXZX
Timestamp1607926290609
ACT score20
AgeAbove 50
GenderMale
Outdoor jobOccasionally
Outdoor activitesExtremely likely
Smoking habitNo
Temperature29.8°C
Humidity70%
Air pressure1009.0 hectopascals
UV indexLow (i.e. 0 to 2)
Wind speed2.1 meters per second

Data pre-processing

The first step of data pre-processing is identifying the missing data in the dataset through a heatmap, illustrated in Figure 1, which visualises the locations of missing values. Fortunately, the selected dataset does not contain any missing or NaN values.

338e5732-6440-4402-8a09-e3a7375777f9_figure1.gif

Figure 1. Heatmap visualisation.

The second step is dropping irrelevant features including “User ID” and “Timestamp”. “Smoking habit” is also dropped because its correlation coefficient value with the target variable “ACT score” in the heatmap is close to zero. Table 2 represents the final dataset, which consists of 1010 records with ten input features and one output variable. Figure 2 shows the ACT scores’ distribution, which ranges from 12 to 21. Figure 3 illustrates a scatterplot and countplot for weather features, where a strong correlation can be observed between the weather features and “ACT score”.

338e5732-6440-4402-8a09-e3a7375777f9_figure2.gif

Figure 2. ACT distribution.

338e5732-6440-4402-8a09-e3a7375777f9_figure3.gif

Figure 3. Weather visualisation.

Table 2. Input/output variables.

NameTypeDescription
AgeCategoricalInput demography
GenderCategoricalInput demography
LocationCategoricalInput demography
Outdoor activitiesCategoricalInput demography
Outdoor jobCategoricalInput demography
TemperatureNumericalInput weather
HumidityNumericalInput weather
Air pressureNumericalInput weather
UV indexCategoricalInput weather
Wind speedNumericalInput weather
ACT scoreNumericalTarget output

The third step is converting the categorical variables in the dataset to numeric representations using the label encoder. The fourth step is splitting the dataset into training (707 samples) and testing (303 samples) datasets.

Regression with DNN

DNN can be modelled with various ML techniques, such as regression and classification.18 Regression is responsible for modelling and characterising the relationship between the input features and the target output. Regression is applied to predict numerical values.19 Hence, regression is used in this study to predict the ACT score, which is a numerical value.

Consequently, a DNN Regression (DNNR) model is applied on the dataset. In DNNR, the hidden layers are located between the input layer and the output layer, as seen in Figure 4. The hidden layers apply weights to input values and direct them via an activation function for the output values.20 The activation function assists in deriving distinguishing features that are required for the prediction.10 This is particularly helpful to model the asthma dataset which contains multiple types of input features. The Rectified Linear Unit (ReLU) activation function is used because it provides nonlinear transformations for deep modelling.9

338e5732-6440-4402-8a09-e3a7375777f9_figure4.gif

Figure 4. DNNR architecture.

The following are the main equations used for prediction using the DNNR model:

x=[x1,x2,...,xm]w=[w1,w2,...,wm]f(e)={0fore<0efore0e=j=1mxw+by^=f(e)

where x is the input features, y^ is the predicted values, w is the input weights, b is the bias (a constant number used for adjustment), e is the internal elements in the hidden layers, f(e) is the activation function, m is the number of input features, and j is a constant number between [0, m].

Evaluation metrics

Evaluating the DNNR model is essential to determine its prediction error and accuracy, which can be achieved through Mean Absolute Error (MAE), Mean Squared Error (MSE), and Explained Variance Score (EVS). The MAE sums up the absolute difference between the actual and the predicted values. The MSE sums up the squared differences between the actual and the predicted values. The EVS computes the variance score which determines the accuracy of nonlinear regression models.9 The following equations calculate the MAE, MSE and EVS of the DNNR model:

MAE=1ni1n|y^iyi|MSE=1ni1n(y^iyi)2EVS=1v(y^y)v(y)

where y is the actual output values, y^ is the predicted values, n is the number of records in the dataset, v is the biased variance, and i is a constant number between [0, n].

Optimisation methods

Optimising the DNNR model is crucial for prediction with low error, high accuracy, and less computing time. This can be achieved by applying essential optimisation methods which include data scaling and parameter tuning. For data scaling, standardisation is used because it is beneficial for enhancing the performance of the DNNR model and its optimisation.9 This happens by rescaling the input and the output values using the following equations:

s=sμσμ=1ni1nsiσ=1ni1n(siμ)2

where s is the input/output variables, s′ is the standardised input/output values, μ is the mean of the input/output values, σ is the standard deviation of the input/output values, n is the number of records in the dataset, and i is a constant number between [0, n].

The DNNR parameters include hidden layers, nodes at each hidden layer, batch size, epochs, weight initialiser, loss function, and optimiser. Grid-search is an optimisation algorithm which automates the trial procedure of tuning these parameters and selecting their best values.21 Nevertheless, tuning a large number of parameters and their search values using grid-search leads to excessive computational time and power. In this study, the fragmented-grid-search method is used where parameters are tuned independently in parallel, hence taking less computing time for optimisation. Figure 5 demonstrates the optimisation algorithm and Figure 6 illustrates the overall optimisation process.

338e5732-6440-4402-8a09-e3a7375777f9_figure5.gif

Figure 5. Optimisation algorithm.

338e5732-6440-4402-8a09-e3a7375777f9_figure6.gif

Figure 6. Optimisation process.

Results

Regression results

Using the DNNR model on the dataset, a score of 0.83 is achieved with MAE = 1.44 and MSE = 3.62. Table 3 shows 5 predicted values against their actual values and Figure 7 contains the residual visualisation. It can be seen that the differences between the predicted and the actual values vary up to ±15. While this might seem an acceptable prediction error for some datasets, in the case of the asthma dataset, this amount of loss is unacceptable. This is because the range of the ACT score can only be from 5 to 25, where scores of 5 to 15 are categorised as “poorly-controlled asthma”, 16 to 19 as “not well-controlled asthma”, and 20 to 25 as “well-controlled asthma”.17

338e5732-6440-4402-8a09-e3a7375777f9_figure7.gif

Figure 7. DNNR Residual.

In the last row of Table 3, with the actual value of 19 (not well-controlled), the predicted value is 20 (well-controlled), which gives a contradictory prediction result. This can be a serious problem while providing tailored feedback to asthmatic patients, resulting in an insufficiently effective asthma self-management solution. For an optimised model, the acceptable loss range needs to be less than ±0.5. For example, with the actual value of 19, the prediction value can be at most 19.4≃19 (with maximum +0.4 loss) or at least 18.6≃19 (with maximum −0.4 loss). Therefore, an optimised-DNNR model is built to reduce the prediction error and increase the overall accuracy.

Table 3. Actual vs. predicted values.

Actual valuesGroupPredicted valuesGroup
12Poorly-controlled16Not well-controlled
25Well-controlled19Not well-controlled
17Not well-controlled22Well-controlled
21Well-controlled18Not well-controlled
19Not well-controlled20Well-controlled

Optimisation results

For the optimised-DNNR model, two hidden layers are used with 50 nodes at each hidden layer. Adaptive Moment Estimation (Adam) is used as the optimiser, which is helpful for optimising the learning and convergence rates during model training.13 Table 4 summarises the optimum parameter values obtained using fragmented-grid-search and the total tuning time. Figure 8 shows the loss rate of the training and the testing datasets swiftly decreased using the ReLU activation function.

338e5732-6440-4402-8a09-e3a7375777f9_figure8.gif

Figure 8. Loss function.

Table 4. Optimum parameters.

No. of hidden layers2
No. of nodes50
Batch size10
No. of epochs100
Loss functionMSE
OptimiserAdam
Weight initialiserNormal
Total tuning time26 minutes

Using the optimised-DNNR model on the dataset, a score of 0.91 was achieved with a total accuracy of around 94%. The MAE and the MSE rates are 0.20 and 0.09 respectively, which are in the acceptable loss range (error < 0.5). Figure 9 illustrates the residual plot of the optimised-DNNR model which shows a strong correlation between the predicted and the actual values. Figure 10 confirms that the optimised-DNNR model provides predictions within the loss range ±0.5.

338e5732-6440-4402-8a09-e3a7375777f9_figure9.gif

Figure 9. Optimised-DNNR prediction.

338e5732-6440-4402-8a09-e3a7375777f9_figure10.gif

Figure 10. Optimised-DNNR residual.

Discussion and conclusion

Recent popularity of mHealth and DNN enabled developing solutions to collect data from asthmatic patients and provide accurate predictive alerts. Although several studies support the association between weather and asthma, there is a lack of solutions for effective asthma self-management that can predict asthma exacerbation based on personalised weather triggers. This is due to three problems:

  • 1. Limited availability of real-time weather data that can link weather triggers with demography and asthma severity for individual asthmatic patients. This study obtained the dataset from the WEA application which comprises relevant input features (weather and demography) and target output (asthma severity).

  • 2. Existence of nonlinear relationships in the asthma dataset due to multiple types of input features and interconnected correlations. This study applied DNN for modelling the dataset, which effectively handles nonlinearity by using the ReLU activation function.

  • 3. Lack of accurate predictive models and precautionary frameworks for effective asthma self-management. This study built an optimised model that provides accurate predictions of asthma exacerbation with errors in the acceptable loss range (error < 0.5).

The experimental results reveal that the standardisation technique improves the stability of the DNNR model, which enhances the performance of the optimisation algorithm and the optimiser. Furthermore, the fragmented-grid-search method is able to tune several parameters with much less computing time (≈26 minutes) than the standard grid-search used in previous studies (e.g. ≈4.3 hours for tuning 2 parameters22). Moreover, model training takes less than one minute due to the Adam optimiser, which helps the model converge efficiently. Overall, the optimised-DNNR model provides predictions with a significantly higher accuracy rate (94%) than the existing ML models in the literature for predicting asthma exacerbation (e.g. 87% with naïve Bayes,2 85% through logistic regression,8 and 84% using random forest23).

Consequently, the optimisation process helps build an enhanced model for effective asthma self-management. Subsequently, the optimised model will be integrated into the WEA application for predicting asthma exacerbation based on personalised weather triggers and providing tailored feedback to users. The main limitation of this study is that the data was collected from a limited number of users and in one climate region. In future, more users from different climate regions will be considered for testing the generalisation capability of the proposed model.

Author contribution

RH and SBH conducted the research, analysed the data, and wrote the paper. IC and AA improved and edited the paper. All authors have approved the final version.

Comments on this article Comments (0)

Version 1
VERSION 1 PUBLISHED 10 Sep 2021
Comment
Author details Author details
Competing interests
Grant information
Copyright
Download
 
Export To
metrics
Views Downloads
F1000Research - -
PubMed Central
Data from PMC are received and updated monthly.
- -
Citations
CITE
how to cite this article
Haque R, Ho SB, Chai I and Abdullah A. Optimised deep neural network model to predict asthma exacerbation based on personalised weather triggers [version 1; peer review: 3 approved]. F1000Research 2021, 10:911 (https://doi.org/10.12688/f1000research.73026.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.
track
receive updates on this article
Track an article to receive email alerts on any updates to this article.

Open Peer Review

Current Reviewer Status: ?
Key to Reviewer Statuses VIEW
ApprovedThe paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approvedFundamental flaws in the paper seriously undermine the findings and conclusions
Version 1
VERSION 1
PUBLISHED 10 Sep 2021
Views
11
Cite
Reviewer Report 02 Nov 2021
Kwan-Yong Sim, Swinburne University of Technology Sarawak Campus, Kuching, Malaysia 
Approved
VIEWS 11
The main contribution of the paper is to identify the correlation patterns among asthma, weather, and demographic variables. Optimization of the DNNR model by applying standardization and fragmented grid search successfully reduced the error rate below 0.5 required for effective asthma ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Sim KY. Reviewer Report For: Optimised deep neural network model to predict asthma exacerbation based on personalised weather triggers [version 1; peer review: 3 approved]. F1000Research 2021, 10:911 (https://doi.org/10.5256/f1000research.76644.r95805)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Views
8
Cite
Reviewer Report 22 Oct 2021
Meei Hao Hoo, Department of Internet Engineering and Computer Science, University of Tunku Abdul Rahman, Kajang, Malaysia 
Approved
VIEWS 8
The novelty of this study is to improve the accuracy of prediction of asthma by using optimised deep neural network and this gives a contribution to enhance the current model in terms of accuracy and computational performance. Most decisions in ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Hoo MH. Reviewer Report For: Optimised deep neural network model to predict asthma exacerbation based on personalised weather triggers [version 1; peer review: 3 approved]. F1000Research 2021, 10:911 (https://doi.org/10.5256/f1000research.76644.r94416)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Views
19
Cite
Reviewer Report 07 Oct 2021
Mark Elshaw, Computing. Electronics and Mathematics, Coventry University, Coventry, UK 
Approved
VIEWS 19
This an interesting paper, which is timely. There is suitable use of deep learning, however there is a need to consider whether the problem is big data. It is good that there is some consideration of preprocessing. The paper follows ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Elshaw M. Reviewer Report For: Optimised deep neural network model to predict asthma exacerbation based on personalised weather triggers [version 1; peer review: 3 approved]. F1000Research 2021, 10:911 (https://doi.org/10.5256/f1000research.76644.r95806)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Comments on this article Comments (0)

Version 1
VERSION 1 PUBLISHED 10 Sep 2021
Comment
Alongside their report, reviewers assign a status to the article:
Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations - A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions
Sign In
If you've forgotten your password, please enter your email address below and we'll send you instructions on how to reset your password.

The email address should be the one you originally registered with F1000.

Email address not valid, please try again

You registered with F1000 via Google, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Google account password, please click here.

You registered with F1000 via Facebook, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Facebook account password, please click here.

Code not correct, please try again
Email us for further assistance.
Server error, please try again.