Background

F1000Research

2046-1402

F1000 Research Limited

London, UK

10.12688/f1000research.140395.2

Research Article

Articles

COVID-19 Vaccine: Predicting Vaccine Types and Assessing Mortality Risk Through Ensemble Learning Algorithms

[version 2; peer review: 2 approved, 1 approved with reservations]

Monadhel

Hind

Formal Analysis Funding Acquisition Methodology Resources Software Visualization Writing – Original Draft Preparation https://orcid.org/0000-0002-9923-4979 a 1 Abbas

Ayad R.

Supervision Writing – Review & Editing https://orcid.org/0000-0002-9797-421X 2 Mohammed

Athraa Jasim

Supervision Writing – Review & Editing 3 1Computer Science, University of Technology - Iraq, Baghdad, Iraq 2Computer Science, University of Technology- Iraq, Baghdad, Iraq 3Computer Science, University of Technology- Iraq, Baghdad, Iraq

a cs.20.38@grad.uotechnology.edu.iq

No competing interests were disclosed.

23 8 2024

2023

1200

17 6 2024

2024

This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

There is no doubt that vaccination is crucial for preventing the spread of diseases; however, not every vaccine is perfect or will work for everyone. The main objective of this work is to predict which vaccine will be most effective for a candidate without causing severe adverse reactions and to categorize a patient as potentially at high risk of death from the COVID-19 vaccine.

Methods

A comprehensive analysis was conducted using a dataset on COVID-19 vaccine adverse reactions, exploring binary and multiclass classification scenarios. Ensemble models, including Random Forest, Decision Tree, Light Gradient Boosting, and extreme gradient boosting algorithm, were utilized to achieve accurate predictions. Class balancing techniques like SMOTE, TOMEK_LINK, and SMOTETOMEK were incorporated to enhance model performance.

Results

The study revealed that pre-existing conditions such as diabetes, hypertension, heart disease, history of allergies, prior vaccinations, other medications, age, and gender were crucial factors associated with poor outcomes. Moreover, using medical history, the ensemble learning classifiers achieved accuracy scores ranging from 75% to 87% in predicting the vaccine type and mortality possibility. The Random Forest model emerged as the best prediction model, while the implementation of the SMOTE and SMOTETOMEK methods generally improved model performance.

Conclusion

The random forest model emerges as the top recommendation for machine learning tasks that require high accuracy and resilience. Moreover, the findings highlight the critical role of medical history in optimizing vaccine outcomes and minimizing adverse reactions.

Classification algorithm COVID-19 Vaccine ensemble learning machine learning Sampling methods Side effects.

The author(s) declared that no grants were involved in supporting this work.

Revised Amendments from Version 1

All the reviewers' comments have been addressed, and a new section was added to clarify the algorithm and sampling method selection, as well as parameter tuning. Additionally, a new table and figure were included.

Abbreviations

COVID-19

Coronavirus Disease 2019

Decision Trees

LGBM

Light Gradient Boosting Machine

Machine Learning

mRNA

messenger ribonucleic acid

Random Forests

SARS-CoV

Severe Acute Respiratory Syndrome- associated coronavirus

SMOTE

Synthetic Minority Oversampling Technique

VAERS

Vaccine Adverse Event Reporting System

XGB

extreme Gradient Boosting Machine

Introduction

From seven to 13 years of research and development (R&D) and 1.8 million clinical trials to develop a vaccine in the past, we have transitioned to 10 to 18 months of R&D and tens of thousands of clinical trials to start vaccinating against COVID-19 in 2021. ¹

Vaccines are biologics that provide active adaptive immunity against particular diseases. The vaccine usually contains drugs similar to the microorganisms that cause the disease. It is generally made from one of the killed or attenuated micro-organisms, its toxins, or its surface proteins. Giving us an injection, nasal spray, or oral vaccine stimulates our immune system to recognize and destroy foreign bodies. ²

As a result of the novel coronavirus's rapid dissemination and disease burden, pharmaceutical companies and researchers were forced to create vaccinations quickly using either novel or preexisting technologies. ³ There are several different types of vaccines, and the purpose of each type is to boost your immune system and prevent serious, life-threatening diseases from occurring. ⁴ The COVID-19 vaccines that have been approved employ a variety of mechanisms of action, including mRNA, DNA vaccines, viral vectors, protein subunits, and virus-inactivated vaccination techniques. ⁵ Three vaccines have been widely administered: Pfizer and Moderna (mRNA) vaccinations targeting the SARS-CoV-2 surface protein, and the Janssen (viral vector) vaccine, which employed pre-existing technology with an adenovirus vector to trigger an immune response and provide protection against further infection. As these vaccines were developed using various approaches, they differ in efficacy and storage conditions. ⁶

However, no vaccine is entirely free from complications or adverse reactions. Any vaccination can have early adverse reactions, including local ones like pain, swelling, and redness, as well as systemic ones like headache, chills, nausea, fatigue, myalgia, and fever. ⁷ Also, several existing health conditions or symptoms the candidate already has can lead to severe adverse reactions after taking the COVID-19 vaccine. The candidate's death could be the worst-case scenario. As a result, it's critical to know about the candidate's previous medical history. ⁸

This paper delves into an in-depth analysis of adverse effects associated with COVID-19 vaccination using data mining techniques to predict the most appropriate vaccine for individual candidates and identify patients at high risk of mortality from COVID-19 vaccination. To accomplish these pivotal objectives, an extensive analysis was conducted using a comprehensive COVID-19 vaccine adverse reaction dataset, shedding light on crucial factors influencing vaccine outcomes.

This work's main contribution can be summarized as follows: 1.

Identify the most important features of an individual's medical history that could contribute to adverse reactions to vaccination.

Identify the most important features that contributed to the death of the candidate based on his or her medical history.

Address the challenge of the imbalanced dataset by employing sampling methods to effectively handle the imbalance and improve the reliability of the analysis.

Develop a machine learning (ML) model capable of predicting and classifying the most suitable vaccine types for each candidate, thus helping to prevent severe consequences and ensure optimal vaccination outcomes.

The rest of this paper is organized as follows: In the next section, we discuss a brief review of the literature on various related works. Section 3, provides a detailed explanation of our methodology and dataset. Section 4 discusses the study findings, while Section 5 covers the strengths and limitations. Section 6 presents the conclusions, and Section 7 outlines the future work.

Literature review

Due to the rapid advancement of technology, there are numerous opportunities and possibilities for ML in healthcare. ⁹ Classification is the most well-known machine-learning technique in medical applications because it is similar to everyday problems. A classification algorithm builds a model based on training data and then applies it to test data to obtain a prediction. ¹⁰

Interestingly, some studies have utilized machine learning applications to predict side effects, reactogenicity, and morbidity incidence following COVID-19 vaccinations. Research by Sujatha et al., ⁸ the authors developed a model to predict whether a candidate is suitable for COVID-19 vaccination. Four machine learning approaches, namely Logistic Regression, AdaBoost, Random Forest, and Decision Tree were employed in the task of prediction. The authors found that AdaBoost was the classifier with the best performance, achieving an accuracy of 0.98. The number of symptoms has been restricted to five for the sake of proper implementation. While this limitation streamlines the analysis process, it may overlook potential rare symptoms or nuances in symptomatology, impacting the comprehensiveness of the study's conclusions.

In research by Hatmal. M et al., ¹¹ the authors used machine learning and ensemble methods to predict the severity of side effects, defined as none, mild, moderate, or severe. The analysis revealed that random forest and XGBoost achieved the highest accuracy (0.80 and 0.79, respectively) and Cohen’s κ values (0.71 and 0.70, respectively). Statistical data analysis revealed that side effects significantly varied based on vaccine type. According to this study, the COVID-19 vaccines approved by the CDC are safe, and vaccination instills a sense of safety in people. However, severe cases may require additional medical care or even hospitalization. The dataset suffered from uneven gender and profession representation, potential result misclassification, and reliance on a self-reported online survey.

In research by Lian et al., ¹² the goal was to collect and analyze tweets about the COVID-19 vaccination to find posts about personal experiences with COVID-19 vaccine adverse events. The authors found that the ensemble model-based RF achieves the best performance with an F1 score of 0.926, an accuracy of 0.908, and a recall of 0.946. The named entity recognition (NER) model achieved an F1 score of 0.770 for detecting adverse events using the conditional random fields (CRF) algorithm. Also, the results show that the three COVID-19 vaccines' (Pfizer, Moderna, and Johnson & Johnson) most common side effects are soreness to touch, fatigue, and headache. Notably, the majority of the participants were young. Additionally, the survey was conducted in a single language, which may present challenges for individuals with Limited English Proficiency (LEP). This linguistic limitation could impact the inclusivity and representation of diverse perspectives in the study.

Methods

The overview of the general methodology for developing a machine learning models is visualized in Figure 1. In this study, we focus on predicting which vaccine will be most effective for a candidate without causing severe adverse reactions (output) based on several factors (input) and handling the imbalanced data that falls under the Pre-processing step where the data preparation process takes place.

Figure 1. Prediction methodology architecture. Dataset

The raw data of individuals who received vaccinations and reported adverse reactions was obtained from the VAERS. ¹³ This dataset contains vaccination information for individuals vaccinated against a variety of diseases including COVID-19, Polio, Tetanus, and Influenza. However, our current study omitted any non-SARS-CoV-2 (COVID-19) vaccination information. Therefore, the dataset being used consists of 49,810 individuals. This dataset has various attributes of individuals’ information such as age, gender, current illness, medical history, allergic history, type of vaccine, life-threatening illness, symptoms after vaccinations, etc. Some of these attributes have been found to be textual (e.g., medical history, symptoms text, etc.), while others have been found to be numerical (such as age, number of doses, etc.). The description of some different attributes in the VAERS data set is illustrated in Table 1.

Table 1. Description of some attributes in the VAERS dataset.

Number	Features	Description	Range	Mean	Standard Deviation
1	AGE_YRS (AS)	Age in years	16-109	57.13	18.43
2	SEX (S)	Sex information: (0: Female, 1: Male)	0-1	56.12	229.37
3	OTHER_MEDS (OM)	Other medications currently being taken	0-1	0.46	0.49
4	CUR_ILL(CL)	Illnesses at the time of vaccination	0-1	0.26	0.43
5	PRIOR_VAX(PV)	Any prior vaccination information	0-1	0.02	0.14
6	VAX_NAME (VN)	Vaccination name	0-2	0.46	0.49
7	Medical History (MH)	Pre-existing chronic or long-standing health conditions	0-1	0.46	0.49
8	ALLERGIES (A)	Any allergy history	0-1	0.25	0.43
9	Died (D)	Died	0-1	0.14	0.35

Preprocessing

The quality of raw data used to perform any analysis heavily influences its outcome. Therefore, the preprocessing and exploratory analysis of data becomes the most important parts of any data-driven investigation. The preprocessing of a dataset involved examining the data for missing values, irrelevant values, replicas, etc. whereas EDA assists in understanding data by visualizing it. It has been noticed that the dataset contains many missing and irrelevant values.

Any COVID-19 vaccine types that were not specified were removed, and only two types of values in the sex field were considered: “M” as male and “F” as female. Unknown values were excluded. In the died field, ‘Y’ was considered yes, and the rest were considered ‘no’; in the ‘prior vaccine’ field, ‘yes’ was considered yes, and the rest were considered ‘no’. The analysis of allergic history included considering mentioned allergic effects as positive cases and considering ‘null’, ‘none’, ‘NA’, and other negatively mentioned text as negative cases. The History column in the dataset contained written records of coexisting conditions, requiring the extraction of all of the patient's medical history separately. To better understand the patient's medical history, information about pre-existing chronic and non-chronic diseases, such as chronic obstructive pulmonary disease, hypertension, diabetes, and kidney disease, was extracted. All missing values (i.e., empty, null) were excluded from this field, and spelling/grammar mistakes were fixed.

In the Feature extraction step, most of the important features in the acquired dataset are presented as textual data. However, in order to analyze them, they must be separated into separate entities. As a result, String matching was used to convert all text data into attributes. The correlation plot ( Figure 2) did not demonstrate a significant relationship between various attributes and vaccine types. Yet, previous studies revealed a direct correlation between vaccine adverse reactions and medical and allergic histories. Therefore, the number of unique entries for the diseases in the patient's medical histories was counted. Diseases with more than 300 counts in patients' medical histories were considered attributes, while the rest were ignored due to the large dataset and the computational burden associated with each individual disease. This study, therefore, considered 21 diseases which are diabetes mellitus, thyroid, different pain, obesity, migraine, kidney disease, hypertension, hyperlipidemia, high cholesterol, heart disease, Gastroesophageal Reflux Disease (GERD), depression, dementia, positive history of COVID-19, Chronic Obstructive Pulmonary Disease (COPD), cancer, atrial fibrillation, asthma, arthritis, anxiety, and anemia from the patient’s medical history as attributes. Using the VAERS id, these files have been merged into one file after identifying and extracting features. The analyzed dataset has 28 different features and over 49,810 samples. The data was encoded using a one-hot encoding technique.

Figure 2. Correlation plot between different features of the VAERSA dataset. Data-Sampling Algorithms

In this study, only three methods of handling imbalanced data are used. In the first place, no changes are made to the data. Normally, it is divided into training and testing data at a ratio of 8 to 2. This first technique is referred to as “Normal” in this study. Next, experiments are conducted using well-known imbalanced data techniques called SMOTE, Tomek-links, and SMOTETOMEK, for balancing the dataset which combines SMOTE and Tomek links. ¹⁴ As with the previous experiment, the dataset is divided into training and testing data at a ratio of 8 to 2. This experiment aims to handle imbalanced data and further improve the performance of machine learning classification models, especially in the multiclass classification scenario. This experiment aims to handle imbalanced data and further improve the performance of machine learning classification models, especially in the multiclass classification scenario. These Sampling methods were selected considering data nature, imbalance ratio, algorithm compatibility, and analysis goals.

Description of Ensemble Methods

To predict which vaccine will be most effective for a candidate without causing severe adverse reactions (output) based on several factors (input), different machine-learning algorithms were used to build the proposed model. These approaches were selected due to their accuracy, robustness, efficiency, scalability, and ability to handle large, high-dimensional datasets while reducing overfitting.

Random Forest (RF)

A multipurpose data mining approach for classification. It is based on decision trees that operate as an ensemble, an approach of combining multiple classifiers to identify problems and enhance accuracy. A classification is predicted by each tree independently, and votes for the relevant class, and the majority of votes decide the model’s prediction. It can handle large dataset with high dimensionality, it also improves the accuracy of the model and eliminates the overfitting problem. ¹⁵

Decision Tree (DT)

A DT is a supervised learning technique that can be used for classification and regression problems; however, it is most commonly used to resolve classification issues. In this tree-organized classifier, the internal nodes represent datasets, branches represent decision rules, and each leaf node represents the outcome. A DT has two nodes: the decision node and the leaf node. The leaf nodes are the result of such decisions and they do not have any extra branches, but decision nodes are frequently used to settle any decision and have several branches. Based on the features of the dataset, decisions or tests are made. ¹⁶

Extreme Gradient Boosting (XGB)

XGBoost is an ensemble learning method combining multiple weak models' predictions to generate a stronger prediction. In the beginning, XGB fits the data to a weak classifier. Afterward, the data is fitted to another weak classifier to increase accuracy without affecting the current model. In the same way, the process continues until the best accuracy is achieved. ¹⁷ Furthermore, XGBoost supports parallel processing, making it possible to train models on large datasets in a reasonable period of time.

Light Grading Boosting Machine (LGBM)

LGBM is an open-source gradient boosting algorithm based on a tree-based learning framework; it is an open-source GBDT algorithm designed by Microsoft Research Asia. This framework grew trees vertically (leaf-wise) rather than horizontally (level-wise) as other tree-based frameworks did. Therefore, it can reduce the losses more efficiently and handle huge dataset with less computational complexity due to its lighter version. ¹⁸

Model performance evaluation

Macro average and Weighted Average are used to calculate the performance of the four classifiers used for learning.

In general, a confusion matrices are a 2 × 2 matrix. Where rows represent the instances in the actual class, and the columns represented the predicted class. It results in four possible outcomes: TP, FP, TN, and FN.

Figure 3. Confusion matrix examples.

(a) Binary classification confusion matrix. (b) Multiclass classification confusion matrix. ¹⁹

Using the above outcomes, we can check whether the predictions are correct. ¹⁶ ^, ²⁰ ^, ²¹ •

Accuracy: This term tells us how many classifications were correct out of all classifications. Accuracy = TP + TN TP + TN + FP + FN

•

Precision: A model's precision tells us how reliable its predictions are. Precision = TP TP + FP

•

Recall: The model's ability to detect class. Recall = TP TP + FN

•

F-score: It will give us a harmonic mean of precision and recall. F − score = 2 . precision · recall precision + recall

•

ROC Curve & AUC

ROC Curves show the performance of the classification model across all classification thresholds. In a ROC curve, the TP rate and FP rate are plotted at each threshold of classification. “AUC” stands for “Area Under the ROC Curve”. It can be used as a classifier to distinguish between classes. In general, the higher the AUC value, the better the classifier is at identifying positive from negative classes. ²² ^, ²³

Algorithms Hyperparameter Tuning Using Grid Search

Hyperparameter tuning is a crucial step in the data mining model development process, involving the refinement of hyperparameters within a data mining algorithm to uncover the optimal combination that enhances classifier performance. The Grid Search approach is a widely recognized and effective method for hyperparameter tuning. ²⁴ In the context of this paper, Grid Search was employed using the GridSearchCV object from scikit-learn to thoroughly explore and identify the hyperparameter set that consistently produces the most favorable results. Known for its systematic and methodical approach to hyperparameter tuning, Grid Search operates by specifying a set of hyperparameters and their potential values, creating a grid of all possible combinations, and assessing the model's performance for each. This method exhaustively searches through the grid, identifying the hyperparameters that consistently yield the best results, and fine-tunes the model for optimal performance.

A comprehensive overview of the default parameters for the data mining classifiers is provided in the appendix. Additionally, it details the parameters specifically assigned to each classifier for the purpose of randomized parameter optimization to enhance performance.

Results and Discussion

The majority of the individuals, 74% in total, were identified as female It was estimated that the average age of the individuals was about 53 years old and that the average age of those who died was about 72 years. Thus, there is a noticeable age difference between the two groups. In terms of the reported chronic diseases, chronic hypertension emerged as the most prevalent at 13%, followed by asthma at 12%. Kidney issues and anemia were reported in approximately 2% of the cases. Understanding the prevalence of pre-existing conditions is crucial in assessing the potential impact of the vaccine on individuals with specific health conditions. Additionally, a history of allergies, including various types of allergic events not limited to anaphylaxis, was frequently observed, representing approximately 20% of the total cases and close to 25% of the fatality cases ( Figure 6). According to reports, 10.7% of individuals who have received vaccinations have died. From Figure 5, one can clearly observe that the majority of COVID-19 vaccination case fatalities are concentrated among individuals aged 70 to 89, regardless of gender, highlighting this age group as particularly vulnerable to severe outcomes. Additionally, a significantly higher mortality rate is observed among males compared to females between the ages of 60 and 99, indicating that males within this age range are more susceptible to severe adverse effects leading to fatalities from the COVID-19 vaccination. These findings underscore the heightened vulnerability of elderly individuals, particularly those in their 70s and 80s, and emphasize the increased risk of severe outcomes among older males.

Figure 4. Case Fatality Number by Age Band and sex. Figure 5. Reported chronic diseases. Figure 6. Class label counts before and after applying the various data-sampling techniques for the vaccine type dataset.

The most frequently reported local and systemic side effects after each of the three available COVID-19 vaccinations include headache, pain at the injection site, rash, chills, fatigue, fever, body pain, and vertigo. These symptoms collectively accounted for nearly 10% of all observed cases, typically presenting as mild and transient, reflecting the body's natural immune response to vaccination. Additionally, several other adverse reactions were commonly reported, including various types of pain, nausea, rash at the injection site, muscle aches (myalgia), and shortness of breath. The incidence of these reactions ranged from 4.8% to 9.4%, while the occurrence of other adverse reactions was less than 5% ( Figure 7).

Figure 7. Top 29 frequently reported symptoms associated with the Moderna, Pfizer, and Janssen vaccines.

Extensive experiments have been conducted to predict three significant events in COVID-19 vaccination according to different scenarios. ML’s most relevant model to classify vaccines in each scenario includes RF, DT, XGB, and LGBM. We used 80% training data and 20% test data to evaluate the effectiveness of different ML-based approaches. As was previously mentioned, this dataset was unbalanced; therefore, We employed sampling strategies to address this problem. A number of well-known performance measures were used to assess the results of classification, including accuracy, precision, recall, F1 score, and ROC-AUC.

Our results are presented in two parts each with two scenarios: (a) multiclass classification with sampling, (b) binary classification with sampling, and (c) a comparison of the best model for each part.

Multiclass classification results: based upon both medical history and vaccine type

This section presents the results of the multiclass classification for covid-19 vaccine predicting problem, along with the analysis and the discussion. Firstly, we considered the patient’s medical history as independent features and the vaccine type (value 0 means Moderna, 1 means Pfizer, and 2 means Janssen) as dependent features that depend on the independent features. Then each of the three data-sampling procedures—SMOTE, TOMEK-LINKS, and SMOTETOMEK—was applied separately. Figure 6 illustrates the effects of applying various data-balancing techniques.

The performance parameters for each model on the test dataset are presented in Table 2. As a result, the following observations have been noted: •

The testing accuracy values range from approximately 75% to 81% across different models and methods. The Random Forest (RF) models with Normal, TOMEK-LINKS, and SMOTETOMEK methods achieved the highest testing accuracy of around 80.8%, while the XGBoost (XGB) and LightGBM (LGBM) models with Normal, SMOTE, and TOMEK-LINKS methods achieved slightly lower testing accuracy, ranging from 75.2% to 76.2%.

•

The training accuracy values are relatively close to the testing accuracy values, indicating that the models are not overfitting to the training data. The training accuracy values range from approximately 76.9% to 81.2%.

•

Macro Precision, Recall, and F1 Scores: These metrics provide insights into the models' performance for each class, and the macro averaging considers all classes equally. The RF and DT models consistently show similar precision, recall, and F1 scores across different methods, ranging from around 78.9% to 81.6%. The XGB and LGBM models tend to have slightly lower scores, ranging from approximately 70.5% to 74.6%. The RF models generally achieve the highest scores, while the XGB and LGBM models have the lowest scores.

•

The AUC (Area Under the Curve) values represent the performance of the models in terms of their ability to rank samples correctly across all classes. The AUC values range from approximately 78% to 85%. The RF models with SMOTE and SMOTETOMEK methods achieved the highest AUC values of around 85%, indicating better overall performance in distinguishing between different vaccine types.

•

Overall, the RF models consistently perform well across different methods, with relatively higher accuracy, precision, recall, F1 scores, and AUC values. The XGB and LGBM models have lower performance compared to RF and DT models. The SMOTE and SMOTETOMEK methods generally improve the performance of the models, as seen in higher AUC values compared to the Normal and TOMEK-LINKS methods. These models achieve relatively high testing accuracy, balanced precision, recall, and F1 scores, as well as high AUC values.

Table 2. Performance measures of multiclass classification.

Method	Model	Testing Accuracy	Training Accuracy	Macro Precision	Macro Recall	Macro F1 scores	AUC
Normal	RF	0.80823	0.81208	0.81569	0.78993	0.79653	0.84
	DT	0.80823	0.81210	0.81569	0.78993	0.79653	0.84
	XGB	0.76174	0.76480	0.78625	0.72301	0.74137	0.80
	LGBM	0.75218	0.75926	0.77682	0.70460	0.72229	0.78
SMOTE	RF	0.80317	0.80393	0.80226	0.80057	0.79300	0.85
	DT	0.80374	0.80397	0.80180	0.80102	0.79356	0.85
	XGB	0.75740	0.75748	0.74593	0.75048	0.74586	0.81
	LGBM	0.74953	0.75284	0.73641	0.74317	0.73613	0.80
TOMEK-LINKS	RF	0.80823	0.81208	0.81569	0.78993	0.79653	0.84
	DT	0.80823	0.81210	0.81569	0.78993	0.79653	0.84
	XGB	0.76174	0.76480	0.78625	0.72301	0.74137	0.80
	LGBM	0.75218	0.75926	0.77682	0.70460	0.72229	0.78
SMOTETOMEK	RF	0.80358	0.80451	0.80189	0.80089	0.79341	0.85
	DT	0.80374	0.80451	0.80180	0.80102	0.79356	0.85
	XGB	0.75901	0.75789	0.74595	0.75350	0.74740	0.82
	LGBM	0.75708	0.76176	0.74552	0.75530	0.74652	0.82

ROC curves have been used to further analyze the predictive capability of these developed models, which are shown in Figure 8. The RF and DT models prove their effectiveness. Taking AUC into account, all developed models perform satisfactorily.

Figure 8. ROC curves for covid-19 type multiclass classification. Binary classification results

In our model’s analysis, firstly, we considered the patient’s medical history as the independent features, and the vaccine type (value 0 means Moderna and value 1 means Pfizer) and the patient death (value 0 mean alive, and value 1 mean died) as dependent features. We trained and evaluated our models using test data by measuring accuracy, precision, recall, and AUC.

Scenario 1: Based upon both medical history and vaccine type

The performance parameters for each model on the test dataset are presented in Table 3. As a result, the following observations have been noted: 1.

RF achieved high testing accuracy (0.87091) and training accuracy (0.87439), indicating good generalization and low overfitting. It demonstrated high precision (0.87974), recall (0.87091), and F1 score (0.87424), suggesting a balanced performance between identifying positive and negative instances. The AUC (0.93) indicates a high discriminatory power of the model. The precision value for both RF and DT was reported as 0.87. XGB and LGBM also show a comparable precision value of 0.86 and 0.0.84, respectively.

DT achieved similar testing accuracy (0.86975) and training accuracy (0.87439) as RF. It showed slightly lower precision (0.8779), recall (0.86975), and F1 score (0.8728) compared to RF. The AUC (0.93) suggests a good ability to distinguish between positive and negative instances.

XGB achieved a slightly lower testing accuracy (0.85905) and training accuracy (0.86122) compared to RF and DT. It demonstrated comparable precision (0.86031), recall (0.85905), and F1 score (0.8596) to the testing accuracy, indicating a balanced performance. The AUC (0.91) suggests a reasonably good ability to discriminate between positive and negative instances.

LGBM showed the lowest testing accuracy (0.84953) and training accuracy (0.85038) among the models. It had slightly lower precision (0.84771), recall (0.84953), and F1 score (0.84857) compared to the other models. The AUC (0.89) suggests a good ability to distinguish between positive and negative instances, although it is lower than RF and DT.

The RF and DT models with vaccine-type target consistently achieved the highest accuracy, Recall, Precision, F1 score, and AUC, especially RF outperforms all others. XGB and LGBM models had slightly lower performance metrics but still maintained reasonable accuracy and AUC.

Thus, the experimental analysis recommends the RF model is the most suitable for detecting vaccine type compared to the other models.

Table 3. Experimental performance of Scenario 1 the models with binary vaccine type dataset.

Method	Model	Testing Accuracy	Training Accuracy	Precision	Recall	F1 scores	AUC
Normal	RF	0.87091	0.87439	0.87974	0.87091	0.87424	0.93
	DT	0.86975	0.87439	0.8779	0.86975	0.8728	0.93
	XGB	0.85905	0.86122	0.86031	0.85905	0.8596	0.91
	LGBM	0.84953	0.85038	0.84771	0.84953	0.84857	0.89

ROC curves have been used to further analyze the predictive capability of these models, which are shown in Figure 9. The RF and DT models prove their effectiveness. Taking AUC into account, all developed models perform satisfactorily.

Figure 9. Scenario 1: ROC Curve for binary vaccine type dataset.

Scenario 2: based upon both medical history and death

The patient’s death dataset was also experimented with as the vaccine-type dataset. Figure 10 demonstrates the effect of applying various data-sampling methods. The performance parameters for each model on the test dataset are presented in Table 4. As a result, the following observations have been noted: 1.

The testing accuracy values range from approximately 79.9% to 85.7%, depending on the model and method used. The RF and XGB models consistently achieve higher testing accuracy compared to DT and LGBM models. Among the methods, TOMEK-LINKS and SMOTETOMEK methods tend to show slightly lower testing accuracy compared to Normal and SMOTE methods.

The training accuracy values are relatively high, ranging from approximately 87% to 95.2%. However, there is a notable difference between the training accuracy and testing accuracy values, suggesting potential overfitting issues, especially for the RF models.

Precision, Recall, and F1 scores: The precision, recall, and F1 scores provide insights into the models' performance for predicting the positive class (death possibility). The RF models consistently achieve higher precision, recall, and F1 scores compared to DT, XGB, and LGBM models. Among the methods, TOMEK-LINKS and SMOTETOMEK methods tend to show slightly lower precision, recall, and F1 scores compared to Normal and SMOTE methods.

The AUC (Area Under the Curve) values represent the models' ability to rank samples correctly and discriminate between positive and negative classes. The AUC values range from approximately 66% to 86%. The RF and XGB models consistently achieve higher AUC values, indicating better overall performance in distinguishing between COVID-19 death possibilities.

the models trained on the normal data generally performed better in terms of accuracy and AUC compared to the models trained on the modified datasets (SMOTE, TOMEK-LINKS, SMOTETOMEK). The Random Forest, XGBoost, and LGBM models consistently showed good performance across the metrics in all datasets, indicating their robustness and effectiveness in classification tasks. The Decision Tree model had relatively lower performance, especially in terms of AUC, in all methods.

Figure 10. Class label counts before and after applying the various data-sampling techniques for the death dataset.

Table 4. Performance measures of different methods with death dataset.

Method	Model	Testing Accuracy	Training Accuracy	Precision	Recall	F1 scores	AUC
Normal	RF	0.84261	0.95272	0.82632	0.84261	0.83277	0.83
	DT	0.82917	0.95272	0.81553	0.82917	0.82151	0.66
	XGB	0.85700	0.93064	0.83523	0.85700	0.84072	0.86
	LGBM	0.85700	0.91312	0.83938	0.85700	0.84503	0.86
SMOTE	RF	0.81861	0.92647	0.84585	0.81861	0.82961	0.83
	DT	0.81285	0.92647	0.83576	0.81285	0.82256	0.70
	XGB	0.79942	0.88992	0.84183	0.79942	0.81556	0.81
	LGBM	0.806147	0.87468	0.85443	0.80614	0.82339	0.82
TOMEK-LINKS	RF	0.84261	0.95240	0.82632	0.84261	0.83277	0.83
	DT	0.83301	0.95240	0.82065	0.83301	0.82608	0.67
	XGB	0.84932	0.92920	0.83149	0.84932	0.83791	0.85
	LGBM	0.85508	0.92002	0.83641	0.85508	0.84236	0.86
SMOTETOMEK	RF	0.80326	0.9198	0.84520	0.80326	0.81909	0.82
	DT	0.78406	0.91983	0.82101	0.78406	0.79924	0.65
	XGB	0.79846	0.88383	0.84462	0.79846	0.81563	0.81
	LGBM	0.79846	0.86958	0.84880	0.79846	0.81665	0.82

ROC curves have been used to further analyze the predictive capability of these models, which are shown in Figure 11. The RF and DT models prove their effectiveness. Taking AUC into account, all developed models perform satisfactorily.

Figure 11. Scenario 2: ROC curve for a Death.

The importance of all the features in the COVID-19 vaccine adverse reactions dataset is calculated using the feature importance package from the Scikit-learn Python library. A visual representation of the calculated values for feature importance is displayed in Figure 12. The features are arranged based on their respective importance scores.

Figure 12. Ranking of features based on the patients' medical history coefficient values.

Figure 12 shows that patients' age, gender, and use of other medicines were significant factors in the past medical history of all target variables. WHEN examining the target variable of “vaccine type,” the analysis revealed a comprehensive set of critical attributes within the patient's medical history that strongly influence the selection of the administered vaccine. These attributes include previous vaccine history, allergic history, diabetes, arthritis, hypertension, and asthma. Furthermore, when investigating the target variable of death status, certain factors emerged as highly significant. These factors include heart disease, allergic history, dementia, hypertension, diabetes, kidney disease, and Chronic obstructive pulmonary disease (COPD). These attributes have shown a noteworthy impact on the desired outcome, indicating their importance in predicting the death status of patients.

The patient's age and gender provide essential demographic information that may impact the choice of vaccine, as certain vaccines have age or gender-specific recommendations. Additionally, considering the patient's current medication usage is crucial to ensure compatibility and potential interactions with the chosen vaccine. Previous vaccine history helps determine if the patient requires a booster or a specific type of vaccine.

The presence of underlying conditions such as diabetes, arthritis, allergic history, hypertension, and asthma is highly influential in the decision-making process. These conditions may affect the patient's immune response or make them more susceptible to certain vaccine side effects. By considering these attributes, healthcare professionals can tailor the vaccine type to maximize efficacy and minimize risks for each patient.

Comparing Proposal Models with Related Works

This section illustrates the comparison of our proposal model with the results of prediction models that are available in the related works. The comparison was structured around the methodologies employed and the achieved levels of accuracy. Table 5 presents the findings from four distinct studies on COVID-19 vaccine side effects. These studies utilized data from Twitter and the VAERS (COVID-19 World Vaccine Adverse Reactions dataset) spanning across different years. Each study employs a unique set of techniques to achieve specific objectives, resulting in varying degrees of accuracy.

Table 5. Comparing results of studies.

Authors	Dataset	Objective	Model	Accuracy (%)	Precision (%)	Recall (%)	F1-scores (%)
M.Ma’mon et al. [May 26, 2021]	Online survey	Predict the severity of side effects	XGB	0.79	-	-	-
			RF	0.80	-	-	-
			K*	0.44	-	-	-
			MLP	0.70	-	-	-
Lian et al. [Jan 11, 2022]	Twitter	Identify personal experiences	SVM	0.89	0.89	0.94	0.91
			LR	0.90	0.92	0.92	0.92
			RF	0.90	0.90	0.94	0.92
			Extra Trees	0.88	0.90	0.90	0.90
			GB	0.89	0.92	0.90	0.91
Sujatha et al. [Oct 26, 2021]	VAERS	Predict suitability for vaccination	LR	0.97	0.88	0.90	0.89
			RF	0.97	0.89	0.95	0.92
			AdaBoost	0.98	0.89	0.97	0.93
			DT	0.97	0.91	0.87	0.89

Strengths and limitations

As far as the authors are aware, this is the first study that attempts to predict the type of covid-19 vaccine appropriate for a candidate, along with the death probability risk. Additionally, we suggest approaches to address the issue of imbalanced data concerning adverse reactions to COVID-19 vaccines.

This study has some limitations. Because these data were collected online, we cannot rule out information-gathering bias in the study. Moreover, this data set contained a significant amount of missing data, which may lead to a misrepresentation of patient populations.

Conclusion and future works Conclusion

In this work, four ML models were evaluated: DT, RF, XGBoost, and LGBM. Three sampling techniques were executed for each model to handle imbalanced data. Below are some of the key findings of the study, which shed light on crucial insights and implications: 1.

The tree-based model RF presented the best overall results with multiclass classification.

The SMOTE and SMOTETOMEK methods generally improve the performance of the models, as seen in higher AUC values compared to the Normal and TOMEK-LINKS methods.

For binary classification in scenario 1, the experimental analysis recommends the RF model as the most suitable for detecting vaccine type compared to the other models.

In scenario 2, the RF, XGBoost, and LGBM models consistently showed good performance across the metrics in all methods, indicating their robustness and effectiveness in classification tasks.

The Decision Tree model had relatively lower performance, especially in terms of AUC, in all methods.

The result revealed that patient age, gender, allergic history, prior vaccine, other medicines, diabetes, hypertension, and heart disease are significant pre-existing factors that strongly influence the selection of the administered vaccine.

According to the study's results, the RF model is recommended for machine learning tasks that demand high accuracy and robustness. While both the XGBoost and LGBM models are also viable options, the RF model could be preferable when dealing with imbalanced data. The effectiveness of these balancing algorithms has been evaluated, leading to the conclusion that no single technique can consistently produce the best results across all datasets. When considering the importance of data distribution, machine learning techniques and balancing algorithms are both crucial.

Future Works

The findings of this study can be extrapolated to various other datasets related to vaccinations. While the inclusion of medical history features was restricted due to the substantial size of the dataset and the computational complexities associated with processing each disease, there is room for further advancement. By automating the system, its capability to analyze predictions based on a broader spectrum of medical history features can be enhanced. As new data streams into the dataset, fresh predictions can be dynamically generated by this automation, considering the prevailing factors at that specific moment. Additionally, the integration of deep learning methodologies presents an opportunity to uncover latent patterns within the data, thereby enhancing comprehension of the intricate dynamics governing COVID-19 vaccine acceptability. This multifaceted approach is poised not only to augment predictive accuracy but also to deepen the understanding of the nuanced interplay between medical history, vaccination patterns, and evolving epidemiological dynamics.

Data availability

The dataset used to support the findings of this study is available at the following: https://vaers.hhs.gov/data/datasets.html.

The dataset is comprised of three CSV files, namely VAERSDATA, VAERSVAX, and VAERSSYMPTOMS. Within these datasets, VAERSDATA provides comprehensive information regarding individuals, VAERSVAX offers details related to vaccines, encompassing vaccination type, manufacturer, dosage count, and vaccination location, and VAERSSYMPTOMS catalog symptoms reported as various illnesses following vaccinations.

[VAERS Data]: https://vaers.hhs.gov/eSubDownload/index.jsp?fn=2021VAERSDATA.csv.

[VAERS Vaccine]: https://vaers.hhs.gov/eSubDownload/index.jsp?fn=2021VAERSVAX.csv.

[VAERS Symptoms]: https://vaers.hhs.gov/eSubDownload/index.jsp?fn=2021VAERSSYMPTOMS.csv.

References 1

Velásquez

: Vaccines, Medicines and COVID-19: How Can WHO Be Given a Stronger Voice? Springer Nature;2022;117.

Dai

Xiong

: Vaccine types. Vaccines-the History and Future. IntechOpen;2019; (pp.1–18).

Eroglu

Nuwarda

Ramzan

: A Narrative Review of COVID-19 Vaccines. Vaccines. 2021;10(1):62. 35062723

10.3390/vaccines10010062

PMC8779282

Monadhel

Abbas

Mohammed

: COVID-19 vaccinations and their side effects: a scoping systematic review [version 1; peer review: awaiting peer review]. F1000Res. 2023;12:604. 10.12688/f1000research.134171.1

Vitiello

Ferrara

: Brief review of the mRNA vaccines COVID-19. Inflammopharmacology. 2021;29(3):645–649. 33932192

10.1007/s10787-021-00863-6

PMC8087898

Patel

Kaki

Potluri

: A comprehensive review of SARS-CoV-2 vaccines: Pfizer, Moderna & Johnson & Johnson. Hum. Vaccin. Immunother. 2022;18(1):2002083. 35130825

10.1080/21645515.2021.2002083

PMC8862159

Al Khames Aga

Alkhaffaf

Hatem

: Safety of COVID-19 vaccines. J. Med. Virol. 2021;93(12):6588–6594. 34270094

10.1002/jmv.27304

PMC8426829

Sujatha

Venkata Siva Krishna

Chatterjee

: Prediction of suitable candidates for COVID-19 vaccination. Intell. Autom. Soft Comput. 2022;32(1):525–541. 10.3233/JIFS-202714

Javaid

Haleem

Singh

: Significance of machine learning in healthcare: Features, pillars and applications. Int. J. Intell. Networks. 2022;3:58–73. 10.1016/j.ijin.2022.05.002

Zoumana

KEITA

: “Classification in Machine Learning: An Introduction”,datacamp. Sep 2022. Reference Source

Hatmal

MMM

Al-Hatamleh

Olaimat

: Side effects and perceptions following COVID-19 vaccination in Jordan: a randomized, cross-sectional study implementing machine learning for predicting severity of side effects. Vaccines. 2021;9(6):556. 34073382

10.3390/vaccines9060556

PMC8229440

Lian

Tang

: Using a machine learning approach to monitor COVID-19 vaccine adverse events (VAE) from twitter data. Vaccines. 2022;10(1):103. 35062764

10.3390/vaccines10010103

PMC8781534

VAERS Data Sets: Reference Source

Henry

: Imbalanced Classification in Python: SMOTE-Tomek Links Method.

Reference Source

Random Forest Algorithm: Java T point. 2018. Reference Source

Decision Tree Classification Algorithm GeeksforGeeks.

08 May, 2023. Reference Source

“XGBoost ML Model in Python” JavaTpoint, [online]. Reference Source

Banerjee

: LightGBM Classifier in Python Kaggle. 2021. Reference Source

Grandini

Bagli

Visani

: Metrics for multi-class classification: an overview. arXiv preprint arXiv:2008.05756. 2020.

Markoulidakis

Kopsiaftis

Rallis

: Multi-Class Confusion Matrix Reduction method and its application on Net Promoter Score classification problem. In The 14th pervasive technologies related to assistive environments conference. 2021, June:412–419.

Abbas

Farooq

: Skin Detection Using Improved ID3 Algorithm. Iraqi J. Sci. 2019;402–410.

Narkhede

: Understanding AUC - ROC Curve Medium. Jun 26, 2018. Reference Source

Abbas

Kareem

: Age estimation using support vector machine. Iraqi J. Sci. 2018;1746–1756.

Belete

Huchaiah

: Grid search in hyperparameter optimization of machine learning models for prediction of HIV/AIDS test results. Int. J. Comput. Appl. 2022;44(9):875–886.

10.5256/f1000research.167975.r316661

Reviewer response for version 2

Bandyopadhyay

Samir Kumar

1 Referee https://orcid.org/0000-0002-4868-3459 1The Bhawanipur Education Society College, Kolkata, India

Competing interests: No competing interests were disclosed.

5 9 2024

2024

This is an open access peer review report distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

recommendation

approve

After reviewing the revision made by the authors I am satisfied with the revision.

Is the work clearly and accurately presented and does it cite the current literature?

If applicable, is the statistical analysis and its interpretation appropriate?

Yes

Are all the source data underlying the results available to ensure full reproducibility?

Yes

Is the study design appropriate and is the work technically sound?

Yes

Are the conclusions drawn adequately supported by the results?

Are sufficient details of methods and analysis provided to allow replication by others?

Reviewer Expertise:

Image Processing, Forensic Science, Machine Learning, Crime Science, Data Science, etc.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

10.5256/f1000research.153740.r233060

Reviewer response for version 1

Bandyopadhyay

Samir Kumar

1 Referee https://orcid.org/0000-0002-4868-3459 1The Bhawanipur Education Society College, Kolkata, India

Competing interests: No competing interests were disclosed.

31 5 2024

2024

recommendation

reject

Authors:

Q1. Specify the imbalanced dataset with an example.

Q2.What type of Illnesses can be seen at the time of vaccination?

Q3. Explain EDA since otherwise it is difficult to understand the pre-processing steps.

Q4 Pre-existing chronic disease is required to study and it must contain fever nature.

Q5.The process of feature extraction is not provided accurately.

Q6. What is the difference between attribute values and attribute text?

Q7. There are no comparisons with existing methods.

Editor:

Authors studied to predict which vaccine will be most effective for a candidate without causing severe adverse reactions and to categorize a patient as potentially at high risk of death from the COVID-19 vaccine. In the Feature extraction step, most of the important features in the acquired dataset are presented as textual data. However, in order to analyze them, they must be separated into separate entities. All important parts are not explained properly.

Is the work clearly and accurately presented and does it cite the current literature?

If applicable, is the statistical analysis and its interpretation appropriate?

Yes

Are all the source data underlying the results available to ensure full reproducibility?

Yes

Is the study design appropriate and is the work technically sound?

Yes

Are the conclusions drawn adequately supported by the results?

Are sufficient details of methods and analysis provided to allow replication by others?

Reviewer Expertise:

Image Processing, Forensic Science, Machine Learning, Crime Science, Data Science, etc.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.

Monadhel

Hind

Computer Science, University of Technology Baghdad, Baghdad, Baghdad, Iraq

Competing interests: Non

14 6 2024

Dear Samir,

Thank you for dedicating your time and effort to review our manuscript. Your insightful comments and suggestions have provided valuable guidance for improving our work.

While we are disappointed by the decision, we genuinely appreciate the constructive feedback that will undoubtedly strengthen our future research efforts.

Thank you once again for your thorough review.

10.5256/f1000research.153740.r245449

Reviewer response for version 1

Jinran (Ryan)

1 Referee https://orcid.org/0000-0002-2388-3614 1Australian Catholic University, Fitzroy, Victoria, Australia

Competing interests: No competing interests were disclosed.

3 5 2024

2024

recommendation

approve-with-reservations

The authors proposed predicting vaccine types and assessing mortality risk through some ensemble learning approaches. The research topic is interesting, and some comments are given for further consideration.

1. Based on Google Scholar, authors missed citing many recent references that shouldn't be ignored. Here, I suggest using a table to list and compare their main points to highlight your contributions to the area.

2. For the experiment part, all parameter settings are missing, so that the results declared cannot be repeated. Please release your codes or detailed settings in the appendix.

3. For your results, I suggest authors use a cross-validation approach to evaluate the uncertainty of the predictions.

4. Also, considering the imbalance issue, the authors shall consider different penalties for different prediction errors. In other words, we cannot regard "dies" as the same as "non-dies". Authors shall distinguish different losses.

5. The discussion could have been improved. The authors shall further explore the underlying implications in this part. Otherwise, the work looks like a mathematical game. In particular, the authors shall connect results to some findings from some top medicine journals.

6. The authors shall use a professional writing service to make the content clear.

Is the work clearly and accurately presented and does it cite the current literature?

Partly

If applicable, is the statistical analysis and its interpretation appropriate?

Partly

Are all the source data underlying the results available to ensure full reproducibility?

Partly

Is the study design appropriate and is the work technically sound?

Partly

Are the conclusions drawn adequately supported by the results?

Partly

Are sufficient details of methods and analysis provided to allow replication by others?

Partly

Reviewer Expertise:

Machine learning, Forecasting, Applied Statistics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Monadhel

Hind

Computer Science, University of Technology Baghdad, Baghdad, Baghdad, Iraq

Competing interests: Non

14 6 2024

Dear Jinran Wu,

Thank you for your valuable feedback and suggestions. We have carefully addressed all your comments and revised the manuscript accordingly. Specifically, regarding your suggestion " 3. For your results, I suggest authors use a cross-validation approach to evaluate the uncertainty of the predictions," we will upload the experimental code to ensure clarity.

We appreciate your thorough review and believe these revisions have significantly enhanced the quality of our work.

10.5256/f1000research.153740.r257042

Reviewer response for version 1

Ghosh

Aritra

1 Referee https://orcid.org/0000-0002-9204-0434 1Computer Science, Sister Nivedita University, Kolkata, West Bengal, India

Competing interests: No competing interests were disclosed.

28 3 2024

2024

recommendation

approve

Full Report:

Introduction: The introduction provides a comprehensive overview of the transition in vaccine development timelines and the urgent need for effective COVID-19 vaccines. It highlights the significance of understanding vaccine mechanisms and adverse reactions, setting the stage for the proposed machine-learning framework. However, it could benefit from a succinct statement of the study's objectives to guide readers through the subsequent sections more effectively.

Literature Review: The literature review effectively contextualizes the study within existing research on machine learning applications in healthcare, specifically focusing on COVID-19 vaccine prediction. It provides insights into relevant studies while emphasizing the novelty and contributions of the current work. However, the review could be strengthened by discussing potential limitations or gaps in previous research, thereby justifying the need for the proposed study more explicitly.

Methods: The methods section is detailed and well-structured, outlining the data preprocessing, feature extraction, and modeling techniques employed. It effectively communicates the rationale behind each step and provides clarity on the experimental design. The inclusion of figures and tables enhances the understanding of complex methodologies. However, providing more information on the rationale behind the selection of specific sampling techniques and model evaluation metrics would strengthen the methodology further.

Results and Discussion: The results and discussion section presents comprehensive findings from the study, including performance metrics and an analysis of key features. The results are effectively communicated through tables, figures, and textual descriptions, facilitating interpretation. The discussion contextualizes the findings within the broader literature and highlights implications for vaccine selection and adverse reaction prediction. However, a more structured approach to discussing limitations and future research directions would enhance the clarity of the discussion.

Conclusion and Future Works: The conclusion summarizes the key findings and implications of the study while outlining potential avenues for future research. It effectively emphasizes the significance of the study's contributions and underscores the importance of continued research in this area. However, providing more specific recommendations for addressing identified limitations and mitigating potential biases would enhance the conclusion's comprehensiveness.

Overall Assessment: The abstract and full report provides a detailed and insightful analysis of the proposed machine-learning framework for COVID-19 vaccine prediction. The study demonstrates a rigorous approach to data analysis and model evaluation, yielding valuable insights into vaccine efficacy and adverse reactions. Addressing the following minor issues would further enhance the scientific soundness and readability of the report:

Clarity and Readability:

Ensure consistent terminology throughout the report. For example, use either "COVID-19 vaccine" or "SARS-CoV-2 vaccine" consistently instead of switching between them.

Consider breaking down lengthy paragraphs into shorter ones for easier readability and comprehension, especially in sections like "Results and Discussion" and "Conclusion and Future Works."

Provide clear transitions between sections to guide the reader through the report more effectively. Each section should flow logically from one to the next.

Justification:

Provide more justification for the choice of machine learning algorithms. Explain why Random Forest (RF), Decision Tree (DT), Extreme Gradient Boosting (XGB), and Light Grading Boosting Machine (LGBM) were selected over other algorithms. Justify why these algorithms are suitable for the task at hand.

Clarify the reasoning behind choosing specific data sampling techniques (e.g., SMOTE, Tomek-links, SMOTETOMEK) to handle imbalanced data. Explain why these techniques were deemed appropriate and how they contribute to improving model performance.

Structure:

Consider refining the structure of the report to make it more cohesive and organized. For instance, ensure that each section has a clear and specific focus, with subheadings to delineate different topics within the section.

Provide a brief overview or summary at the beginning of each section to outline the main points that will be discussed. This will help readers understand the purpose and scope of each section more clearly.

In the "Conclusion and Future Works" section, provide a concise summary of the key findings and implications of the study. Additionally, offers specific suggestions for future research directions based on the limitations or areas for improvement identified in the study.

By addressing these minor issues, the report will become more scientifically sound and easier to follow for readers, thereby enhancing its overall quality and impact.

Is the work clearly and accurately presented and does it cite the current literature?

Yes

If applicable, is the statistical analysis and its interpretation appropriate?

Yes

Are all the source data underlying the results available to ensure full reproducibility?

Yes

Is the study design appropriate and is the work technically sound?

Yes

Are the conclusions drawn adequately supported by the results?

Yes

Are sufficient details of methods and analysis provided to allow replication by others?

Yes

Reviewer Expertise:

Data Analysis, AI and ML, HCI, Computational Modeling and Big Data, and Web Development. Cuurently working on the application of Machine Learning for COVID-19 vaccine development.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Monadhel

Hind

Computer Science, University of Technology Baghdad, Baghdad, Baghdad, Iraq

Competing interests: Non

14 6 2024

Dear Aritra Ghosh,

Thank you for your valuable feedback and suggestions. We have carefully addressed all your comments and have revised the manuscript accordingly.

We appreciate your thorough review and believe the changes have significantly improved the quality of our work.

Hind M.Abdullah