Estimation of Covid-19 lungs damage based on computer tomography images analysis [version 2; peer review: 1 approved with reservations, 1 not approved]

Modern treatment is based on reproducible quantitative analysis of available data. The Covid-19 pandemic accelerated development and research in several multidisciplinary areas. One of them is the use of software tools for faster and reproducible evaluation of patient data. A CT scan can be invaluable when searching for details, but it is not always easy to see the big picture in 3D data. Even in slice-by-slice visual analysis of CT, inter- and intra-observer variability can make a big difference. We present an ImageJ tool developed together with the radiology center of Faculty hospital Královské Vinohrady for CT evaluation of patients with COVID-19. The tool was developed to help estimate the percentage of lungs affected by the infection. Based on the percentage score, patients can be divided into five groups and appropriate treatment can be applied.


Introduction
The COVID-19 pandemic of recent months has revealed a number of strengths and weaknesses in health systems around the world.
One of the key needs is a quick and accurate diagnosis of the patient, which was problematic in congested hospitals. Software engineering and image processing methods could help speed up and refine patient diagnosis, especially in radiological and radiodiagnostic workplaces, where a large part of the diagnostic process relies on image data (CT, NMR, X-ray). Recent advances in image analysis motivate a more collaborative approach to quantitative analysis, since it usually requires expertise in bioimage analysis. 1,2 Various software tools have been used for this purpose for years. In general, they can be divided into two groups:
• universal software packages: used for general analysis of image data, such as filtering, smoothing or image registration
• software tools "made to measure": specific software tools for the analysis of rare diseases
The first group is represented mostly by software integrated into packages supplied by the tomograph developer. Examples include a software tool for CT image preprocessing and automated analysis of three standard phantoms 3 and a software tool for reducing metal artifacts in dental care. 4 The second group is much more interesting from both the research and the application point of view, although only a small part of these tools is applied in a real clinical environment. Examples include a tool for analysis of GPA disease using image registration and self-organizing maps 5 and a tool for analysis of peripheral bypass grafts. 6 Many research groups have focused on precise measurement of pathological findings, 3D analysis, or volumetric analysis. 7,8 Moreover, some papers deal with the fusion of images from different scanners, e.g. a combination of data from CT, PET/CT, SPECT/CT, or MR. 9,10
Thus, the topic of CT image analysis of "covid lungs" is essential from both the research point of view (there is still room for further research in precise semi-automatic analysis) and the clinical point of view.
The availability of tools for scientific research remains a challenge for both researchers and end users. Although access to scientific papers is increasingly open, the availability of reproducible resources, code, and data is not yet widespread. Access to the results of scientific studies is crucial, but access to the necessary tools makes a real difference. Unfortunately, code is often not available in open-source form, complete with step-by-step tutorials and opportunities for reporting issues. While software such as ImageJ 11 and 3D Slicer 12 exists for image analysis, it is geared toward experienced image analysts and may not be user-friendly for end users who are not familiar with creating analysis workflows. The end user often depends on core facilities or available documentation and tutorials for support. The 3D Slicer CT Lungs Analyzer project for lung analysis is promising, but it is still in development and relies on U-Net deep learning segmentation. There is a need for a portable, user-friendly software tool for reproducible quantitative analysis of CTs to estimate COVID-19 pneumonia in the lungs.
Therefore, the aim of this software paper is to present a semi-automatic software tool for "covid lungs" CT image analysis, based on the knowledge presented in Ref. 13. The authors of that study based their idea on the correlation between the degree of lung involvement and the course of the disease. The global score (0-25) of lung involvement is calculated based on the extent of volume involvement (0: 0%; 1: <5%; 2: 5-25%; 3: 26-50%; 4: 51-75%; 5: >75%). The authors then introduce the role of CT score in predicting the outcome of SARS-CoV-2 patients. The scoring is highly correlated with laboratory findings, disease severity and mortality. Moreover, it might speed up diagnostic workflow in symptomatic cases.

REVISED Amendments from Version 1
The manuscript underwent a thorough review by two experts who provided valuable feedback. The first reviewer questioned the radiographic findings reported in the abstract and requested more information about the practical use of the image analysis software tool. In response, the authors clarified that the findings were based on the percentage of lung coverage and updated their ImageJ software tool. The second reviewer suggested strengthening the argument for the tool's novelty, robustness, and importance by comparing it with other available tools. The authors addressed these concerns by emphasizing the need for a user-friendly, open-source tool for reproducible quantitative analysis of CT scans, sharing their work openly to help other hospitals facing similar challenges, and providing technical details and code for software development. In addition, the article now includes an improved step-by-step explanation of the workflow, improved versions of the software tool, and inter- and intra-variance comparisons. The updated software tool and scripts are available through the repository or GitHub.
Any further responses from the reviewers can be found at the end of the article

Image format
The Covid CT estimation tool is based on standard image processing techniques. Because we are interested in volume, a consistent voxel size is critical for a good estimate. It is also important to go through the different types of data we can encounter. In general, Hounsfield Units (HU) make up the grayscale in medical CT imaging. It is a scale from black to white of 4096 values (12 bit) and ranges from -1024 HU to 3071 HU (zero is also a value). It is defined as follows: -1024 HU is black and represents air (in the lungs). 0 HU represents water (since the body consists mostly of water, there is a large peak here). 3071 HU is white and represents the densest tissue in the human body, such as tooth enamel. Materials with higher atomic numbers, such as bones, appear as brighter areas on CT images and are assigned higher HU values (typically between +700 and +3000). All other tissues lie somewhere within this scale; fat is around -100 HU, muscle around 100 HU, and bone spans from 200 HU (trabecular/spongeous bone) to about 2000 HU (cortical bone).
DICOM files are usually saved as signed 16-bit data with the original HU values, and CT images usually come with 3 mm or 0.6 mm slicing. TIFF files, however, may have their histogram values reshaped to cover the whole range, and may be stored as unsigned 16-bit (preferable) or 8-bit, the latter with some loss due to conversion. TIFF files usually lose the Z voxel size metadata in conversion (resulting in a Z voxel size of 1), so it is essential to reset the voxel size. The XY voxel size can differ between data sets, even from the same CT machine. The distribution of intensity values may change with different CT protocols, so some of the processing steps need to be done manually.
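The loss caused by an 8-bit conversion can be illustrated with a short sketch. It is not taken from the published macros: the function names are hypothetical, and a plain linear rescale over the full -1024 to 3071 HU range is assumed (real exports may apply a narrower display window). The `stored_to_hu` helper uses the standard DICOM rescale relation (HU = stored value × slope + intercept); its default slope and intercept are only illustrative and should be read from the file header in practice.

```python
HU_MIN, HU_MAX = -1024, 3071  # 12-bit CT range described above (4096 values)


def stored_to_hu(stored, slope=1.0, intercept=-1024.0):
    """Convert a stored pixel value to HU with the DICOM rescale relation.

    The defaults are an assumption for unsigned storage; real values come
    from the RescaleSlope / RescaleIntercept header fields.
    """
    return stored * slope + intercept


def hu_to_uint8(hu, lo=HU_MIN, hi=HU_MAX):
    """Linearly rescale a HU value to 0..255, clipping outside [lo, hi]."""
    hu = max(lo, min(hi, hu))
    return round((hu - lo) * 255 / (hi - lo))


def uint8_to_hu(level, lo=HU_MIN, hi=HU_MAX):
    """Approximate inverse: the HU value an 8-bit grey level maps back to."""
    return lo + level * (hi - lo) / 255


def hu_per_level(lo=HU_MIN, hi=HU_MAX):
    """Number of HU that collapse into one 8-bit grey level."""
    return (hi - lo) / 255


if __name__ == "__main__":
    print("HU per 8-bit level:", hu_per_level())
    for hu in (-1024, -100, 0, 3071):
        level = hu_to_uint8(hu)
        print(hu, "->", level, "->", uint8_to_hu(level))
```

Under these assumptions roughly 16 HU share one grey level, which is why thresholds chosen on 8-bit data cannot be transferred directly to HU-based (16-bit) data.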

Implementation
The workflow follows Cromey's ethical guidelines for the appropriate use and manipulation of scientific digital images. 14 The plugin tool is developed in the ImageJ macro language. It needs the Bio-Formats plugin to import DICOM files, which comes installed in FIJI. The macro uses standard image processing techniques and morphological operations to estimate the volume ratio of lungs and pneumonia caused by COVID-19. It allows users to set up thresholds for pneumonia and lungs in turn, go through the whole data set slice by slice, and interactively tweak the threshold values.
The tool was developed on demand and in coordination with the Department of Radiology of the Faculty hospital Královské Vinohrady. It is challenging to make any kind of percentage estimate of pneumonia in the lungs just by visually inspecting CT scans slice by slice. The available hardware equipment and local account restrictions had to be taken into account when selecting the development tool. The ImageJ plugin is a compromise between accuracy and requirements. The scripts are published with the paper. The workflow for the 8-bit script version is as follows:
j) Duplicate the stack of lung slices twice, creating two separate stacks for lung thresholding and pneumonia thresholding.
2) Analysis
a) For the stack of lung slices, convert the image to 8-bit and apply a threshold to remove all but the lung tissue.
b) Lungs
i) Get the user's input on the threshold values for the lung tissue.
ii) Convert the thresholded image to a mask and clean the mask using erode, dilate, and fill holes operations.
iii) Analyze the selection in the mask to separate the individual lung regions in each stack.
iv) Save the processed image as a TIF file in a new directory with the date and time as part of the file name.
c) Pneumonia
i) Get the user's input on the threshold values for pneumonia.
ii) Convert the thresholded image to a mask and clean the mask using erode and dilate.
iii) Combine the mask with the lung mask using an AND operation.
iv) Analyze the particles in the mask.
v) Save the processed image as a TIF file in a new directory with the date and time as part of the file name.
3) Evaluation
a) Create a new image with CT data as channel 1, lung mask as channel 2 and pneumonia mask as channel 3. Save the composite as TIF in the results folder.
b) Get the total area of lungs and total area of pneumonia for all stacks.
d) Save a log containing information about the whole process in the results folder.
The numeric result and the composite image from step 3.a (original data, lung mask and pneumonia mask) are shown to the user (as illustrated in Figure 1).
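The core of the analysis and evaluation steps above (threshold, AND with the lung mask, ratio of voxel counts) can be sketched as follows. This is a minimal illustration in Python rather than the published ImageJ macro: the function names are hypothetical, the stack is a plain nested list (slices, rows, voxels), and the morphological cleaning (erode, dilate, fill holes) and particle analysis are deliberately omitted.

```python
def threshold_mask(stack, lo, hi):
    """Binary mask of the stack: True where lo <= voxel <= hi."""
    return [[[lo <= v <= hi for v in row] for row in sl] for sl in stack]


def mask_and(a, b):
    """Voxel-wise AND of two masks with identical shape."""
    return [
        [[x and y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(sl_a, sl_b)]
        for sl_a, sl_b in zip(a, b)
    ]


def count_voxels(mask):
    """Number of True voxels in a mask."""
    return sum(1 for sl in mask for row in sl for v in row if v)


def pneumonia_percentage(stack, lung_range, pneumonia_range):
    """Percentage of lung voxels that also fall in the pneumonia range."""
    lungs = threshold_mask(stack, *lung_range)
    # Restrict the pneumonia mask to the lungs, as in step 2.c.iii.
    pneumonia = mask_and(threshold_mask(stack, *pneumonia_range), lungs)
    lung_voxels = count_voxels(lungs)
    if lung_voxels == 0:
        raise ValueError("lung mask is empty; check the lung threshold")
    return 100.0 * count_voxels(pneumonia) / lung_voxels


if __name__ == "__main__":
    # Tiny made-up 8-bit "stack" with one 2x4 slice, for illustration only.
    demo = [[[10, 50, 100, 200], [20, 60, 160, 255]]]
    print(pneumonia_percentage(demo, (0, 155), (47, 115)))
```

Because both masks live on the same voxel grid, the ratio of voxel counts equals the ratio of volumes, so the sketch does not need the voxel size.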

Operation
There are several steps during the tool runtime which require user input:
1. Select the CT lung data (Figure 2, TIFF or DICOM file depending on the script version). The CT sequence is opened and the user can go through the loaded stack as an image sequence with a slider or as a video with a play button.
2. "Please find the start of lungs in stack": the user selects the first image containing lungs with the slider and confirms the selection with the "Ok" button.
3. "Please find the end of lungs in stack": the user selects the last image containing lungs with the slider and confirms with the "Ok" button. The tool works only with the images within the chosen interval of the lung stack to minimize the computational effort.
4. "Setup threshold for all but body": everything in the image except the body should be highlighted in red. The tool makes an automatic estimate, and the user can adjust the threshold with the sliders on the histogram. Confirm with the "Ok" button.
5. "Setup threshold of Covid": the Covid threshold should be highlighted in red. The tool makes an automatic estimate, and the user can adjust the threshold with the sliders on the histogram. It is not a problem if part of the body (not lungs!) is selected together with Covid. The tool automatically subtracts the body threshold from the chosen Covid threshold. Confirm with the "Ok" button.
• After each calculation the tool adds information to the log window. The log file is automatically saved to the CT data directory. The output lung and Covid masks are saved in TIFF format into an additional folder in the CT data location.
• The tool provides a percentage estimate of Covid damage in the lungs and a semi-quantitative CT score. The score is calculated based on the extent of lobar involvement (0: 0%; 1: <5%; 2: 5-25%; 3: 26-50%; 4: 51-75%; 5: >75%; range 0-5), based on the medical research "Chest CT score in COVID-19 patients: correlation with the disease severity and short-term prognosis". 13 The tool has been tested on both 3 mm slicing and 0.6 mm slicing CT images. The results were similar in percentage and the final CT score was the same.
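The mapping from percentage to semi-quantitative score listed above can be written as a small function. This is a sketch, not the published macro code: the function name is hypothetical, and because the published bands are given in whole percentages, treating the upper bounds (25, 50, 75) as inclusive for fractional values is an assumption.

```python
def ct_score(percent):
    """Map a lung involvement percentage to the 0-5 score bands.

    Bands: 0: 0%; 1: <5%; 2: 5-25%; 3: 26-50%; 4: 51-75%; 5: >75%.
    Assumption: fractional values between bands (e.g. 25.4) are not
    covered by the published bands; upper bounds are inclusive here.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent == 0:
        return 0
    if percent < 5:
        return 1
    if percent <= 25:
        return 2
    if percent <= 50:
        return 3
    if percent <= 75:
        return 4
    return 5


if __name__ == "__main__":
    for p in (0, 3, 5, 31, 32, 60, 90):
        print(p, "->", ct_score(p))
```

With this mapping, two estimates that differ by a few percentage points usually fall into the same band, which is what makes the score less sensitive to threshold choices than the raw percentage.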
To use the tool, the user needs to prepare CT images exported as DICOM or TIFF in the preferred view mode, preferably in 16-bit representation. CT images usually have a 12-bit gray-scale representation, and an 8-bit conversion would lead to a loss of potentially important information or a shift of brightness values. The thickness of the CT slice can also contribute to numerical errors in the process, but there was no significant difference in results when processing the same data set with 3 mm and 0.6 mm slicing.
The ImageJ software tool, available from Zenodo or GitHub, needs an installation of ImageJ (ideally version 1.52v99 or newer) with the Bio-Formats plugin (preferably version 6.8.0, which we tested), or FIJI, which is a distribution of ImageJ with the Bio-Formats plugin already integrated.
The minimal requirements for both are Windows XP or later, Mac OS X 10.8 or later, or Ubuntu Linux 12.04 LTS or later, each with Java installed. The minimal RAM depends on the size of the processed images, since multiple images are open at once.

Use cases
The usability of the introduced tools is presented in the next sections. A use case comparing a CT measured with different slicing setups is presented, and results for a set of 5 CTs evaluated by different users are discussed. Since we were restricted by hardware, two versions of the tool were created: one that works with 8-bit images and needs less RAM, and one that works with 16-bit signed images and can load HU values. The CT scans of COVID-19 patients used in this section were provided by the Department of Radiology of Faculty hospital Královské Vinohrady, where the tool was tested and deployed in September 2021.

Slice thickness variation
The international standard for saving DICOM files defines 3 mm slicing of CT data as the default. However, resaving data as TIFF (losing voxel information) or using a different slice thickness (such as 0.6 mm slicing) may lead to a different result. In theory, 0.6 mm slicing would provide 5 times more detailed sampling along the Z-axis. In practice, however, the outcome is different.
The same CT dataset exported with 0.6 and 3 mm slices (XZ view for comparison is in Figure 3) was analyzed with our tool with a lung threshold of 0-155 and a pneumonia threshold of 47-115. The results can be found in Table 1. The error from a comparison of 3 mm and 0.6 mm slicing is estimated at 0.58 %. The used CT is available in the attached published dataset as CT1_1 (0.6 mm slicing) and CT1_2 (3 mm slicing).
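How slice thickness and voxel size enter the estimate can be made explicit with a small sketch. The function names and all numbers in the usage lines are hypothetical and serve only as an illustration; they are not values from Table 1.

```python
def volume_ml(n_voxels, dx_mm, dy_mm, dz_mm):
    """Volume of n_voxels voxels of size dx x dy x dz (mm), in millilitres."""
    return n_voxels * dx_mm * dy_mm * dz_mm / 1000.0


def involvement_percent(pneumonia_voxels, lung_voxels, dx_mm, dy_mm, dz_mm):
    """Pneumonia volume as a percentage of lung volume on one voxel grid."""
    lung = volume_ml(lung_voxels, dx_mm, dy_mm, dz_mm)
    if lung == 0:
        raise ValueError("lung volume is zero")
    return 100.0 * volume_ml(pneumonia_voxels, dx_mm, dy_mm, dz_mm) / lung


if __name__ == "__main__":
    # Hypothetical counts: 5x more slices at 0.6 mm give 5x more voxels,
    # but each voxel is 5x thinner, so the absolute volume is unchanged.
    print(volume_ml(1000, 0.7, 0.7, 3.0), volume_ml(5000, 0.7, 0.7, 0.6))
    # The percentage is a ratio on a shared grid, so a wrong Z size
    # (e.g. 1 after a TIFF export) does not change it.
    print(involvement_percent(300, 1000, 0.7, 0.7, 3.0))
    print(involvement_percent(300, 1000, 0.7, 0.7, 1.0))
```

The sketch suggests why the percentage is robust to a lost Z voxel size, while any absolute volume reported from the same masks would not be; differences between slicings then come from how the thinner or thicker slices sample the tissue and from the thresholds, not from the voxel size itself.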

User inter and intra variability
The biggest challenge in using this tool is the individual perception of images, as different people may perceive fundamentally the same image data differently. A user can therefore introduce the biggest bias even though the underlying data analysis is done correctly. Table 2 contains a comparison of the results of the analysis of 5 different CT datasets provided by the Faculty hospital of Královské Vinohrady. All CTs were analysed by users with different levels of experience. The first CT, exported with different slicing (also used in Table 1), was analysed by a radiologist (an expert user). The score aims to divide the percentage into groups based on previous research, 13 and should be the deciding factor for the future care of patients.
Ensuring the reliability and accuracy of results when working with user-input tools is crucial and requires careful consideration of inter- and intra-variability. This challenge can be addressed by using standardized procedures and guidelines, multiple raters for segmentation, and computer-aided methods. Our software tool addresses this issue by providing standardized procedures and guidelines, along with the ability to compare results through logs and promote reproducibility.
It is essential to take into account the level of training and experience of individuals performing the segmentation, as well as the time and resources available, as these factors can significantly impact the consistency and accuracy of the segmentation. The software tool provides a solution for addressing the challenges of inter and intra variability in CT data segmentation, helping to ensure careful planning and execution of a study and appropriate outputs to achieve repeatable and comparable results.
Using scoring overcomes some of the problems of comparing percentages directly. The figure comparing CT scoring shows that users rely on their experience and choose parameters based on it. It shows the difficulties in ensuring consistency and accuracy of data segmentation when it is performed manually by multiple individuals. The possibility of comparing the results of the analysis by multiple users using a defined analysis process (Figure 4a) leads to more reliable results. CT1_1 and CT1_2 are the same dataset with different slicing, and the percentage results of the analysis from all users clearly show inter- and intra-variability (Table 2, thresholds and scores in Figure 4). The overall scoring is the same as the result from the trained radiologist (CT1_1: 31%, score 3; CT1_2: 32%, score 3).
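A simple way to summarize such a comparison is to report the spread of the percentages and to check whether the resulting scores agree. The sketch below is an illustration only: the function names are hypothetical, and the only inputs used are the two radiologist results quoted above (31% and 32%, both score 3); the per-user values from Table 2 are not reproduced here.

```python
from statistics import mean, pstdev


def summarize(percentages):
    """Mean, range (max - min) and population SD of repeated estimates.

    The same summary works for inter-variability (several users, one
    dataset) and intra-variability (one user, repeated runs).
    """
    if not percentages:
        raise ValueError("at least one estimate is required")
    return {
        "mean": mean(percentages),
        "range": max(percentages) - min(percentages),
        "sd": pstdev(percentages),
    }


def scores_agree(scores):
    """True when every estimate falls into the same CT score band."""
    return len(set(scores)) == 1


if __name__ == "__main__":
    # Radiologist results for CT1_1 and CT1_2 quoted in the text.
    print(summarize([31, 32]))
    print(scores_agree([3, 3]))
```

Reporting both numbers side by side keeps the percentage spread visible while showing whether it would change the score, and therefore the patient grouping.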

Discussion
The ImageJ/FIJI tool can import various DICOM or TIFF files. Users should always be aware of whether the saved data use a signed or unsigned bit depth, as unsigned data will shift pixel brightness. The same will happen when exporting data in a different bit depth or with a specific CT view. The slicing of the CT dataset also matters; however, the analysis in Table 1 showed that it does not significantly affect either the percentage or the score (other CT machines might have different settings). A small case study of user inter- and intra-variability was made (Table 2) to evaluate the usability of the proposed tool. Some expected variability in the results occurs; interestingly, the inter-variability in evaluating CT1 is 3-5%. The intra-variability is larger, up to 20%, which points to the fact that users should have at least some training in how to recognize pneumonia in CT images.

Conclusions
The tool was developed on demand from the Department of Radiology at the Faculty hospital Královské Vinohrady, as it was difficult for them to estimate the percentage and score of pneumonia in the lungs just by visually inspecting CT scans. Available hardware equipment and local account restrictions had to be taken into account for development tool selection. The ImageJ plugin is a compromise in accuracy and requirements. It logs all the user inputs for reproducibility and saves the results of all the steps as TIFF stacks. These masks and images can be used for visual inspection or possibly in the future for more advanced machine learning tools.
This software tool is the first step of a longer journey to create a tool that is both easy for radiologists to use when diagnosing COVID-19 based on CTs and includes advanced image analysis for percentage estimation of pneumonia in the lungs. The use of open software promises ease of future development; however, it might be beneficial to move from ImageJ to 3D Slicer 12 or Napari 15, as they offer better tools for 3D visualization and integration of machine learning tools, which we aim to develop and integrate in our future work.

Limitations
The biggest limitation of this approach is human error and the inter- and intra-variation of manual selection. The percentage estimate might also be affected by other body cavities filled with air. There might also be a variance in results based on slice thickness, in the worst-case scenario 20%, but our experiment shows that there is only about a 0.58% difference in results between 0.6 mm and 3 mm CT slice thickness. The scoring should also be improved so that it does not depend on only one value (volume percentage); the normalized Single Hounsfield Unit (SHU) distribution in the pneumonia area should also be considered. When converting from a 12-bit to an 8-bit image representation, the reduced range of values results in a loss of information and detail, which can lower the quality of the output. However, for CT image segmentation, the use of SHU values is adequate, as SHUs do not rely on single units and can provide good-quality segmentation.
From the software point of view, there is a limitation in the version of ImageJ used. The new version of the code logs the ImageJ version and the Bio-Formats plugin version. There is one version of the code made explicitly for ImageJ version 1.52v99 and one for other versions. Binding the version helps the reproducibility of any analysis based on logs, and it is advised to reproduce the analysis in the same version of ImageJ as indicated in the logs.

Software availability
Zenodo: ImageJ tool for percentage estimation of pneumonia in lungs, https://doi.org/10.5281/zenodo.7885379. 17 The second version of the repository is restructured and contains both a new version of the ImageJ scripts (.ijm files in folder tools) and the ImageJ scripts published with the first version of this Software Tools article (subfolder 0.3c1 of folder tools).
The new folders inter_intra and repeatability contain the source files and Jupyter Notebook files used for the evaluation and the presented graphs.
The newest version of the code, a tutorial, and the possibility to report any issues are available through the GitHub repository https://github.com/martinschatz-cz/ImageJ_Pneumonia_Estimation_Tool

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool? No
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article? No
Our replies to your comments:
1. The Introduction should provide a strong argument for why the software tool is important.
Thank you for your valuable feedback. We agree that the Introduction should provide a strong argument for the software tool's importance. As we highlighted in the new paragraph, there is a need for an open-source, user-friendly software tool for reproducible quantitative analysis of CT scans to estimate COVID lung pneumonia. Current software tools such as ImageJ and 3D Slicer may not be user-friendly for end users who are not familiar with creating analysis workflows. Our software tool aims to fill this gap by providing a user-friendly solution for reproducible quantitative analysis, with available code, training data, tutorial and GitHub repository.

2. It is of importance to have sufficient results to justify the novelty of the proposed software tool.
The main motivation for the software tool was that no other similar software tool was available for use on the computer infrastructure of the Faculty Hospital of Královské Vinohrady. The main obstacles were the internal network policy, the unavailability of any high-performance computing hardware, and the need for a specific tool. Since more hospitals might be challenged in similar ways, our strongest motivation was to share our work openly and freely to help them innovate.
3. The robustness of the proposed software tool has not been addressed; this should be emphasized in the discussion section.
Since the software tool is user operated, there is high inter- and intra-variability of results, which we addressed in the text. The robustness of the whole workflow is now addressed in terms of repeatability.

4. What are the benefits of the proposed new software tool over other current tools?
The only other open-source software tool addressing similar medical analysis is the SlicerLungCTAnalyzer plugin for the 3D Slicer software (available on GitHub: https://github.com/rbumm/SlicerLungCTAnalyzer). Of course, commercially available software such as Thoracic VCAR also exists, but the cost and hardware requirements might put such software out of reach for many facilities, and since these tools are closed they might be sufficient for diagnosis but not for further research.
The benefits of our software tool are that it is developed as open source, has very low hardware requirements, and is easy to use, with low technical requirements for installation. These were also the requirements of the faculty hospital that requested it.
Is the rationale for developing the new software tool clearly explained?
○ We added a comparison to another freely available open-source software tool, as requested.
Is the description of the software tool technically sound?
○ We improved the pseudo-code on top of the available source code, which should improve the overview of the whole tool.
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?
○ The whole original code and its update are available through a linked (now updated) repository and on GitHub with a small tutorial. The data and software availability parts were extended. We added pseudocode for a better overview, but the best option is to go through the code. Any problems encountered with the macro on specific machines and/or ImageJ versions can be reported through an issue in the GitHub repository.
Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?
○ The output of the software tool is a log of the whole process and partial results; it is a necessary step for reproducibility, good data management and interpretation. If you encounter any issues with the tool or the prepared protocols, please report them in the GitHub repositories so we can properly address them. We are happy to help; sustainability is an important part of our project.
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?
○ We included more analysis of the performance of the tool and also an inter- and intra-variability overview (also available on GitHub). Scripts are included in the linked materials together with the new version of the software tool.
In the meantime, we optimized our ImageJ scripts and corrected some reported issues; the updates can be found in the linked GitHub repository. They include time and result robustness tests, an inter- and intra-variance overview based on the reported parameters used, and a new version with updated log protocols to help users troubleshoot possible issues with the version of ImageJ. New logs from both the 8-bit and 16-bit versions of the scripts now contain both the ImageJ version and the Bio-Formats version used. We welcome any issue reports or enhancement suggestions in the GitHub repository.
Thank you again for your valuable feedback.
Martin Schätz
correlation with disease severity and short-term prognosis" (https://doi.org/10.1007/s00330-020-07033-y) where the lungs are divided and the result of each part is summed up to the score. It would be possible to do the same with 3D Slicer and the "Lung CT Analyzer" (https://github.com/rbumm/SlicerLungCTAnalyzer) deep learning tool; however, it would require installing it and having more powerful hardware available. The software tool paper aims to present a software tool that was made at the request of the Faculty Hospital; however, an analysis of different patterns of radiographic findings would be quite an interesting suggestion for a full research article.
2) Please describe, if you had success implementing the image analysis in your clinical practice. What are the benefits and/or barriers of using the analysis in the real world?
The ImageJ software tool was developed at the request of the Faculty Hospital of Královské Vinohrady, and it has been used by the Radiology center since September 2021. The score was mainly used as an additional estimate to help doctors assess the severity of COVID-19 disease in hospitalized patients. The first benefit is a quick estimate of lung coverage by the disease, which can be challenging to obtain by visual inspection. The second benefit is that ImageJ and this specific software tool do not need installation (which can be challenging within the hospital IT infrastructure) and work on an average office computer. The challenge is the aforementioned inter- and intra-variability per user, but logging the whole process helps with that.
In the meantime, we optimized our ImageJ scripts and corrected some reported issues; the updates can be found in the linked GitHub repository. They include time and result robustness tests, an inter- and intra-variance overview based on the reported parameters used, and a new version with updated log protocols to help users troubleshoot possible issues with the version of ImageJ. New logs from both the 8-bit and 16-bit versions of the scripts now contain both the ImageJ version and the Bio-Formats version used. We welcome any issue reports or enhancement suggestions in the GitHub repository.
Thank you again for your valuable feedback.

Martin Schätz
Competing Interests: No competing interests were disclosed.