X-ray versus computerized tomography (CT) images for detection of COVID-19 using deep learning [version 1; peer review: awaiting peer review]

Background: The recent outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the disease it causes (coronavirus disease 2019; COVID-19) has been declared a pandemic by the World Health Organization. COVID-19 has become a global crisis, shattering health care systems and weakening the economies of most countries. The testing methods currently employed include reverse transcription polymerase chain reaction (RT-PCR), rapid antigen testing and lateral flow testing, with RT-PCR used as the gold standard despite its accuracy being a mere 63%. It is a manual, time-consuming process, taking an average of about 48 hours to obtain results. Alternative methods employing deep learning techniques and radiological images are emerging. Methods: In this paper, we used a dataset consisting of COVID-19 and non-COVID-19 folders for both X-ray and CT images, containing a total of 17,599 images. This dataset was used to compare 3 (non-pre-trained) CNN models and 5 pre-trained models, and their performance in detecting COVID-19 under various parameters such as validation accuracy, training accuracy, validation loss, training loss, prediction accuracy, sensitivity and the required training time, with CT and X-ray images separately. It can be concluded that X-ray images provide a higher accuracy in detecting COVID-19, making them an effective input modality for detecting COVID-19 in real life.


Introduction
The recent coronavirus disease 2019 (COVID-19) pandemic, instigated by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become a global catastrophe, with 257,469,528 registered cases and 5,158,211 registered deaths worldwide as of 23rd November 2021 [WHO Coronavirus Dashboard, Accessed on 24th November 2021]. The virus spreads primarily in three ways: inhaling air contaminated with fine droplets and/or aerosol particles containing the virus; deposition of exhaled droplets containing the virus onto exposed mucous membranes (the mouth, nose or eyes); and touching mucous membranes after coming into contact with contaminated or soiled surfaces and/or objects [CDC, Accessed on 24th November 2021]. It is therefore essential to maintain and/or improve hygienic habits, such as washing one's hands for at least 20 seconds and sanitizing hands with sanitizers containing a minimum alcohol content of 60%. 1 Avoiding extended stays in enclosed spaces with improper ventilation is also crucial to protect oneself [CDC, Accessed on 24th November 2021]. An individual contracting the disease may present a wide range of symptoms: 83%-98% of patients have fever, 59%-78% exhibit dry cough, and 3%-29% may require admission to an intensive care unit (ICU) for the management of a myriad of complications, including severe pneumonia, hypoxemic respiratory failure, acute respiratory distress syndrome (ARDS) and multiple organ failure. 2 On the other hand, a small percentage of affected patients present no symptoms. 2 Affected individuals who exhibit symptoms are often diagnosed with COVID-19 using the reverse transcription polymerase chain reaction test, commonly referred to as the RT-PCR test. Samples are collected from the nasal and/or pharyngeal passages or the throat; sputum is also often used as a sample.
Genetic material is isolated from the sample and multiplied. Fluorescent indicators are added to the multiplied genetic material; these glow in the presence of SARS-CoV-2's genetic material [Cleveland Clinic, Accessed on 24th November 2021]. Despite being the gold standard of testing, RT-PCR provides an average accuracy of only 63% [3][4][5], as detection of the virus's genetic material depends on when the patient takes the test and on the amount of virus present in the collected sample. A recovered patient may continue to get a positive RT-PCR result due to small amounts of genetic material still being present, even though they have recovered and cannot spread the virus. Conversely, a recently infected patient may receive a negative result due to the smaller amount of virus present in the sample [Cleveland Clinic, Accessed on 24th November 2021].
To overcome these drawbacks, doctors and scientists alike are researching various other methods of diagnosing COVID-19 that provide higher accuracy rates. One such method uses radiological images and deep learning techniques. 6,7 The imaging modality used for this purpose is often computerized tomography (CT) [8][9][10][11] and/or X-ray. [12][13][14][15] COVID-19 patients show similar features on radiographic images, such as bilateral, multifocal ground-glass opacities with peripheral or posterior distribution, found predominantly in the lower lobes of the lungs in the early stages. In the later stages, these features can be found in areas of pulmonary consolidation. 16 Unfortunately, these features overlap with those of bronchopneumonia, making it difficult for radiologists to tell COVID-19 apart from other diseases that may affect the lungs. 17 Deep learning techniques are used in hopes of revealing underlying features that often go unnoticed, which may aid in confirming the diagnosis.
Several studies have employed deep learning methods to diagnose COVID-19 from radiological images: a total of 8 studies, 4 using CT and 4 using X-ray as their imaging modalities. They are presented in consolidated form in Table 1 and Table 2.
Past works have used either X-ray or CT images to train and test their models, but not both. To the best of our knowledge, no study has compared the performance of models trained with CT images against the same models trained with X-ray images. This paper presents a comparative study using 3 non-pre-trained models and 5 pre-trained models to find which imaging modality provides better accuracy in detecting COVID-19 using deep learning techniques.

Methodology
This section elucidates the dataset used, the kind of network employed and the working of this study, which took place from January 2021 to July 2021. The general workflow used in the study is shown in Figure 1. Non-pre-trained models were developed for the study and were rejected due to their low validation accuracies. To compare the results and efficacy of the two imaging modalities and to deduce the better one, 5 pre-trained models, namely VGG16, VGG19, ResNet50, InceptionV3 and Xception, were loaded and fine-tuned according to the needs of this study. The pre-trained models were chosen primarily based on their accuracy (higher accuracy yields better results) and the speed of model training and prediction (faster training is preferred). Ideally, we would use a smaller model with higher accuracy and a lower time requirement, but realistically a model with deeper layers is slower to execute, while a smaller model has much lower accuracy. Hence, we experimented with the available models to find the option best suited to detecting COVID-19.
For the purposes of training our networks, the dataset named 'Extensive COVID-19 X-Ray and CT Chest Images Dataset', curated and augmented by Walid El-Shafai and Fathi Abd El-Samie, was used. 18 The dataset consists of COVID-19 and non-COVID-19 folders for both X-ray and CT images. There were 5500 non-COVID-19 and 4044 COVID-19 cases in the X-ray folder. For CT images, there were 2628 non-COVID-19 and 5427 COVID-19 images, bringing the total number of images to 17,599. Sample images from the Extensive COVID-19 X-Ray and CT Chest Images Dataset are shown in Figure 2.
[Table 2 summary of prior CT-based works: EfficientCovidNet was evaluated in 3 scenarios (random, slices and voting) using 5-fold cross-validation; the random approach worked best, with the model reaching an accuracy of 87%. 9 A. K. Mishra et al. (2020) used decision fusion to reduce the mistakes made by the models, combining individual predictions by majority voting; the decision fusion model outperformed all other pre-trained models with an accuracy of 88.34%, an AUC of 88.32% and an F1 score of 86.7%. 10 CTnet-10 has two convolutional blocks and a pooling layer combined with a flattening layer, uses a dropout of 0.3, and its accuracy was increased by fine-tuning and image augmentation; it achieved an accuracy of 82.1%, with a training time of 130 s, a testing time of 0.9 s and an execution time of 0.01233 s. 11]
A convolutional neural network (CNN) is one of the many networks that fall under neural networks. It works primarily on the premise that the input is an image or is made up of images. The model is therefore extensively employed in image pattern recognition, making it an effective method for this study.
In this study, the Keras API (version 2.4.3), an open-source software library providing an interface for neural networks, is used to build and train the models. We trained and tested a total of 8 models: 3 models built from scratch and 5 pre-trained models. Each model was trained with the CT dataset and the X-ray dataset separately, and the obtained results were compared. A description of each model is given below. The code is available from GitHub and archived with Zenodo. 21 Model 1: Consists of 4 alternating convolution and max-pooling layers, with 32 filters, 4 × 4 kernels and the ReLU activation function in each convolution layer. The model was trained for 5 epochs and has 23,618 parameters in total, all of which are trainable. Figure 3 shows the model architecture that was used for this study.
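As a sanity check on layer sizes, Keras counts a Conv2D layer's parameters as (kernel height × kernel width × input channels + 1) × filters, where the +1 is the per-filter bias; max-pooling layers contribute no parameters. A minimal sketch in plain Python (the per-layer input channel counts below are illustrative assumptions, since the text does not fully specify Model 1's input shape):

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Parameter count of a Conv2D layer: one (kh x kw x in_ch) kernel
    plus one bias term per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

# Illustrative: a 4x4 convolution with 32 filters over a 3-channel (RGB) input
print(conv2d_params(4, 4, 3, 32))   # 1568 parameters
# Stacking another 4x4, 32-filter convolution on top (32 input channels)
print(conv2d_params(4, 4, 32, 32))  # 16416 parameters
```

Summing such per-layer counts over the convolutional stack (plus any dense layers) reproduces the totals that Keras reports via `model.summary()`.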
VGG19: The difference between VGG16 and VGG19 is the number of layers: 16 and 19 layers, respectively. The architecture of this network is shown in Figure 3. Compared to VGG16, VGG19 performs slightly better but requires more memory due to its additional layers [Understanding the VGG19 Architecture, OpenGenus Foundation, Accessed on 24th November 2021]. 19 The model has 20,074,562 parameters, of which 50,178 are trainable, and was trained for 3 epochs. Figure 4 shows the model architecture that was used for this study. Once the models were trained, parameters such as the number of epochs, training accuracy and loss, validation accuracy and loss, true positives and negatives, false positives and negatives, prediction accuracy, sensitivity and training times were recorded.
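The reported trainable-parameter count is consistent with freezing VGG19's convolutional base, dropping its fully connected top, and attaching a single two-unit classification layer to the flattened features. Assuming the standard 224 × 224 input (so the final VGG19 feature map is 7 × 7 × 512), the arithmetic works out exactly (a sketch of the check, not the training code):

```python
# VGG19 convolutional base (include_top=False) has a fixed parameter count
vgg19_base = 20_024_384

# Flattened 7 x 7 x 512 feature map feeding a Dense layer with 2 outputs
features = 7 * 7 * 512   # 25,088 inputs to the classification head
head = features * 2 + 2  # weights plus 2 biases

print(head)              # 50178, matching the trainable count in the text
print(vgg19_base + head) # 20074562, matching the stated total
```

This suggests the fine-tuning setup kept the base frozen and trained only the two-class head.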

Procedure
The open-source dataset, Extensive COVID-19 X-Ray and CT Chest Images Dataset, was downloaded. The dataset contained 5500 non-COVID-19 and 4044 COVID-19 cases in the X-ray folder and 2628 non-COVID-19 and 5427 COVID-19 images in the CT folder, bringing the total number of images to 17,599. The images were divided into training, testing and validation sets. For each imaging modality, 20 COVID-19 and 20 non-COVID-19 images were set aside for the testing set. About 1860 images of both COVID-19 and non-COVID-19 were initially kept aside for validation for each of the imaging modalities. Upon experimenting with the number of images in the validation sets, it was found that validation sets of 2500 images of both cases for each modality resulted in better validation accuracies. The remaining images were used for training the various models. The above process was repeated with the X-ray training and validation sets. Models 1, 2 and 3 showed validation accuracies of 31.32%, 29.87% and 29.87%, respectively.
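The split described above can be sketched with a small helper. The file names below are hypothetical stand-ins for the dataset folders named earlier, and it is assumed here that the 2500 validation images are taken per class (the text is ambiguous on this point):

```python
import random

def split_dataset(files, n_test=20, n_val=2500, seed=42):
    """Shuffle file names and carve out disjoint test, validation and
    training sets, mirroring the per-class split used in the study."""
    files = list(files)
    random.Random(seed).shuffle(files)  # fixed seed for reproducibility
    test = files[:n_test]
    val = files[n_test:n_test + n_val]
    train = files[n_test + n_val:]
    return train, val, test

# Hypothetical per-class file list, e.g. the 4044 COVID-19 X-ray images
covid_xray = [f"covid_xray_{i}.png" for i in range(4044)]
train, val, test = split_dataset(covid_xray)
print(len(test), len(val), len(train))  # 20 2500 1524
```

Because the slices are taken from one shuffled list, the three sets are guaranteed to be disjoint, so no test image leaks into training.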
Due to these poor validation accuracies, pre-trained models were used in hopes of obtaining higher validation and testing accuracies. The top layers were dropped from the pre-trained models to make them better suited to the study's objective. Preprocessing layers were also added to the front of certain models.
Once the models were trained, validated and tested, parameters such as the number of epochs, training accuracy and loss, validation accuracy and loss, true positives and negatives, false positives and negatives, prediction accuracy, sensitivity and training times were recorded and tabulated.

Statistical analysis
There was no statistical analysis conducted on features during the period of the study.

Results
The results obtained during our study have been tabulated and categorized by imaging modality. Table 3 presents the performance of the CNN models when CT images are used. Similarly, Table 4 presents the performance of the CNN models when X-ray images are used. The number of epochs and the various losses and accuracies are given in both Table 3 and Table 4, along with the training times.
The 3 non-pre-trained CNN models were first employed to gain insight into how the network functions and how it can be altered to increase accuracy rates. From Tables 3 and 4, it can be observed that Models 1 and 2 have 5 epochs while Model 3 has 10. Between Models 1 and 2, Model 2 has the higher training accuracy (79.52% for CT images and 84.63% for X-ray images), owing to its additional convolutional layers and increased filter size. Figure 8 compares the 3 non-pre-trained models based on training accuracy. While the CNN models provided high training accuracies, their validation accuracies were not satisfactory. For deep learning techniques to be employed in a clinical setting to detect COVID-19, the validation accuracies must be high. Hence, pre-trained networks were chosen and trained with our dataset. Tables 5 and 6 show the performance of the 5 pre-trained models when trained and tested using CT images and X-ray images, respectively.
For the pre-trained models, the comparison has mostly been based on validation accuracies. Validation accuracy signifies the generalizability of the model in detecting COVID-19, in this scenario from the data held back from the model during training. Another important parameter is sensitivity, which gives the ratio of actual positives that have been predicted correctly by the model. The number of training epochs was set at 3 after experimentation: a higher number of iterations did not result in higher validation accuracies, and smaller numbers of iterations resulted in lower accuracy rates. Figure 9 compares the obtained validation accuracies of the 5 pre-trained models for both imaging modalities.
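Both sensitivity and prediction accuracy are derived from the recorded confusion counts. A sketch with hypothetical counts for the 40-image test set (20 COVID-19 and 20 non-COVID-19 images per modality, as described in the Procedure):

```python
def sensitivity(tp, fn):
    """Fraction of actual positives the model predicted correctly."""
    return tp / (tp + fn)

def prediction_accuracy(tp, tn, fp, fn):
    """Fraction of all test predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts: 19 of 20 COVID-19 and 19 of 20 non-COVID-19 correct
tp, fn = 19, 1  # COVID-19 images: detected / missed
tn, fp = 19, 1  # non-COVID-19 images: correctly cleared / false alarms
print(sensitivity(tp, fn))                  # 0.95
print(prediction_accuracy(tp, tn, fp, fn))  # 0.95
```

With only 40 test images, each misclassification moves the prediction accuracy by 2.5 percentage points, which is worth bearing in mind when comparing models on this metric.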

Discussion
The obtained results are compared to the highest results from previous works for each imaging modality. These comparisons are based on the imaging modality, and parameters such as validation accuracy, sensitivity and prediction accuracy are taken into consideration wherever available. Figure 10 presents the comparison of our results with those of previous works when X-ray images are used, and Figure 11 presents the comparison when CT images are used. The following tables summarize and compare the accuracies obtained by previous works and by our study for each imaging modality; they also mention the key points of the methodology in each of the referred works. Table 7 compares and summarizes previous works based on X-ray together with our results obtained when the models were trained with X-ray images. Similarly, Table 8 shows the previous works and our work when trained with CT images.
Our methods show comparatively lower results in terms of accuracy. This can be due to several factors: a lower number of epochs, models with fewer layers, etc. While the initial objective of the study was to employ deep learning techniques that detect COVID-19 with adequate accuracy, the main aim was to find which imaging modality provides higher accuracy when used to detect COVID-19. An observable trend is that the performance of a model changes with the imaging modality. Hence, comparing CT and X-ray imaging becomes vital, as it helps develop an in-depth understanding of how a particular pre-trained network interprets the details provided by each radiological modality to detect COVID-19. When trained with CT and X-ray images, VGG19 and Xception have the best validation accuracies, but their associated prediction accuracies are the lowest. Validation accuracy depicts the generalizability of the network, while prediction accuracy is calculated from the obtained true and false positive and negative values. Oftentimes, a larger difference between these values indicates overfitting. In terms of prediction accuracy, Xception and VGG16 show promising results when trained with CT and X-ray images, respectively; their accuracies stand at 95% for Xception and 80% for VGG16.
[Table 7 summary of prior X-ray-based works: one method predicts COVID-19 from chest X-ray images without the need for feature extraction, using the COVID-Xray-5k dataset on 4 pre-trained CNNs (ResNet18, ResNet50, SqueezeNet and DenseNet-121), with model accuracy assessed via sensitivity, the ROC curve and the precision-recall curve; SqueezeNet provided the highest accuracy, at about 98%. 13 T. Ozturk et al.'s DarkCovidNet detects COVID-19 from chest X-ray images and is used for both the two-class and the three-class problem; the two-class problem provided the higher accuracy, at 98%. 14 M. E. Chowdhury et al. (2020) used collected chest X-ray images to train pre-trained models to detect COVID-19 pneumonia, employing 8 shallow networks and 5 deep networks in two experiments (two-class and three-class image classification); the two-class experiment provided the higher accuracy, at 99.7%. 15 Presented method: 3 non-pre-trained and 5 pre-trained models were trained and tested; the non-pre-trained models were rejected due to their low validation accuracies. The pre-trained models used were VGG16, VGG19, ResNet50, Inception V3 and Xception, and all 8 models were trained and tested with X-ray and CT images separately.]
Xception provided the highest validation accuracy, at 88%, when the model was trained with X-ray images. The obtained accuracies could be further improved by increasing the number of epochs during training; however, increasing the epochs would also increase the training time. Another possible improvement would be to expand the existing dataset for better training.

Conclusion
This study was conducted to compare the effectiveness of radiological imaging modalities in the detection of COVID-19 with the help of deep learning techniques. A total of 8 unique models were used: 3 non-pre-trained CNN models and 5 pre-trained models. From the obtained results, it can be noted that Xception provided the highest validation accuracy when X-ray images were used, and VGG19 when CT images were used. When the models are compared in terms of prediction accuracy, VGG16 and Xception provided the highest values when trained with X-ray and CT images, respectively. VGG16 showed the most consistent performance across both X-ray and CT images. In terms of validation accuracy, models trained with X-ray images have shown better results than the same models trained with CT images, for both non-pre-trained and pre-trained models, even though training with X-ray images has been shown to take a much longer time.

[Table 8 summary of prior CT-based works and our results: EfficientCovidNet is a high-quality deep learning model for screening COVID-19 from chest CT images; its performance was evaluated in 3 scenarios (Random, Slices and Voting) using 5-fold cross-validation, and a cross-dataset analysis was presented to mimic real-life situations and unveil drawbacks. Testing in the 'Random' scenario yielded the better results, and the model's initial accuracy of 87.68% dropped to 56.16% during the cross-dataset analysis. 9 A. K. Mishra et al. (2020) discuss the performance of various deep learning models on CT images, with detection based on decision fusion; the decision fusion model outperformed the other pre-trained models with an accuracy of 88.34%. 10 V. Shah et al.'s model, CTnet-10, was compared with 5 pre-trained models (VGG16, DenseNet161, ResNet50, Inception V3 and VGG19); CTnet-10 had an accuracy of 82.1%, lower than the other pre-trained models, but took the least time to train, test and execute. 11 Proposed method: 3 non-pre-trained and 5 pre-trained models (VGG16, VGG19, ResNet50, Inception V3 and Xception) were trained and tested with X-ray and CT images separately; the non-pre-trained models were rejected due to their low validation accuracies. VGG19 provided the highest validation accuracy, at 81.2%, when the model was trained with CT images.]

Data availability
Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).