ALL Metrics
-
Views
-
Downloads
Get PDF
Get XML
Cite
Export
Track
Research Article
Revised

Role of Artificial intelligence model in prediction of low back pain using T2 weighted MRI of Lumbar spine

[version 2; peer review: 2 approved, 1 approved with reservations, 1 not approved]
PUBLISHED 10 Oct 2024
Author details Author details
OPEN PEER REVIEW
REVIEWER STATUS

This article is included in the Artificial Intelligence and Machine Learning gateway.

This article is included in the Manipal Academy of Higher Education gateway.

Abstract

Background

Low back pain (LBP), the primary cause of disability, is the most common musculoskeletal disorder globally and the primary cause of disability. Magnetic resonance imaging (MRI) studies are inconclusive and less sensitive for identifying and classifying patients with LBP. Hence, this study aimed to investigate the role of artificial intelligence (AI) models in the prediction of LBP using T2 weighted MRI image of the lumbar spine.

Methods

This was a prospective case-control study. A total of 200 MRI patients (100 cases and controls each) referred for lumbar spine and whole spine screening were included. The scans were performed using 3.0 Tesla MRI (United Imaging Healthcare). T2 weighted images of the lumbar spine were segmented to extract radiomic features. Machine learning (ML) models, such as random forest, decision tree, logistic regression, K-nearest neighbors, adaboost, and deep learning methods (DL), such as ResNet and GoogleNet, were used, and performance measures were calculated.

Results

Our study showed that Random forest and AdaBoost are the most reliable ML models for predicting LBP. Random forest showed high performance with area under curve (AUC) values from 0.83 to 0.88 across all lumbar vertebrae and L2-L3, L3-L4, and L4-L5 intervertebral discs (IVDs), with AUCs of 0.88 the highest at L5-S1 IVD (0.92). Adaboost demonstrated high performance at the L2-L5 vertebrae with AUC values of 0.82 to 0.90, with the highest AUC (0.97) at the L5-S1 IVD. Among the DL models, GoogleNet outperformed the other models at 30 epochs with an accuracy of 0.85, followed by ResNet 18 (30 epochs) with an accuracy of 0.84.

Conclusion

The study demonstrated that ML and DL models can effectively predict LBP from MRI T2 weighted image of the lumbar spine. ML and DL models could also enhance the diagnostic accuracy of LBP, potentially leading to better patient management and outcomes.

Keywords

Deep learning, Machine learning, low back pain, intervertebral discs, lumbar vertebrae

Revised Amendments from Version 1

As per the reviewer’s suggestion, the advantages of using a wide range of machine learning algorithms and deep learning algorithms such as ResNet and GoogleNet were included in the methodology section. Highlights of the class variability were provided in the methodology section. Clinical significance of the results obtained from classification algorithms were provided in the discussion section.

See the authors' detailed response to the review by Tarun Gangil

Introduction

Low back pain (LBP) is the most prevalent musculoskeletal condition worldwide and the leading cause of disability. In 2019, it held the 9th position in disability-adjusted life years (DALYs) accounting for 2.5% of the overall “DALYS.” LBP was the primary cause of years lived with disability (YLDs), representing 7.41% of the total YLDS. In 2020, there were more than half a billion prevalent cases of LBP globally, and projections indicate that this number will exceed 800 million by 2050. Although age-standardized rates have slightly decreased over the past three decades, the number of LBP cases continues to increase owing to population growth and aging, particularly in Asia and Africa.1,2 LBP can be caused by various factors, including lifestyle, psychological, and social factors. To reduce the incidence of LBP, it is essential to address modifiable risk factors, such as smoking and obesity, which are associated with a high risk of developing condition.3,4 LBP can result from injuries or degenerative changes in the lumbar region, including facet joints, intervertebral discs (IVD’s), ligaments, and muscles. It is also associated with annual tears, disc height reduction, facet degeneration, and end-plate abnormalities such as Schmorl’s nodes, fractures, erosion, and calcifications.58

MRI of the spine is a noninvasive technique that is regarded as the gold standard for detecting and diagnosing spinal diseases. T2 weighted MRI enhances tissue contrast and offers greater sensitivity than traditional CT imaging for diagnosing conditions such as IVD herniation, nerve root entrapment, and spinal canal stenosis. MRI can identify IVD degeneration and vertebral endplate changes, which are associated with clinically significant LBP. Imaging studies have revealed that 87% of asymptomatic individuals also exhibit lumbar IVD abnormalities on MRI.913

Radiomics is a vital medical technique used in clinical practice for evaluation, diagnosis, selection of a course of treatment, and monitoring. Radiomics, a rapidly advancing artificial intelligence (AI) method in medical imaging, can objectively, reproducibly, and efficiently extract numerous quantitative features from medical images. These features are used to develop radiomic models or signatures that aid in interpreting various clinical phenotypes, such as patient genotyping, treatment efficacy, and clinical outcomes.1417

AI encompasses systems that can generate accurate interference from large datasets using advanced computational algorithms. Similar to humans, machines require learning for intelligent behavior. Therefore, AI includes various learning algorithms, such as machine learning (ML) and increasingly popular deep learning (DL) algorithms. Although AI originated in the 1950s, its development has accelerated since 2000, owing to advancements in computational power. Currently, AI technology provides indispensable tools for intelligent data analysis, particularly for solving medical diagnostic problems. The relationship between radiomics and AI is symbiotic. The high-dimensional nature of radiomics demands powerful analytic tools, and AI, with its advanced capabilities, is well-suited for this task. Conversely, AI applications with medical images rely on radiomics because the metrics used to train and build AI models are derived from radiomic approaches, specifically through feature extraction and feature engineering techniques.1826

Few studies have assessed the utility of radiomics-based ML models and DL techniques for predicting LBP. Hence, this study aimed to investigate the role of AI models in the prediction of LBP using T2 weighted MRI image of the lumbar spine.

Methods

This was a prospective, case-control study. The institutional ethical committee (IEC2:179/2023) was obtained from Kasturba medical college and Hospital, Manipal, India on 20th July 2023, followed by the Clinical trial registry (CTRI) registration: CTRI/2023/08/056954, 25/08/2023, https://ctri.nic.in/Clinicaltrials/login.php. Written informed consent (IC) was obtained from all the participants.

Eligibility criteria

Patients referred for MRI of the lumbar spine and whole-spine screening were included. The patients were screened using the questionnaire “Delphi definitions of low back pain prevalence (DOLBaPP)”27 questionnaire to check for LBP prevalence. Patients were considered symptomatic if they experienced LBP for 12 months. Patients were considered asymptomatic if they experienced no current back pain and no memory (severe or disabling) back pain. A total of 100 cases and controls (Mean age; cases: 48.48±16.1 years; controls: 51.46±18.4 years) were included. Demographic details of the patients are shown in Table 1. Patients with tumors, severe osteoporosis, or previous spine surgeries were excluded.

Table 1. Showing the demographic details of the population.

Cases (Symptomatic)Controls (Asymptomatic)
Subject (n)100100
Age in years (Mean±SD)48.48±16.151.46±18.4
Gender42 (Females)40 (Females)
58 (Males)60 (Males)

MRI Image acquisition: All MRI scans were performed with 3.0 Tesla MRI (United Imaging uMR 780). The MRI image acquisition parameters are listed in Table 2.

Table 2. showing the acquisition parameters of T2 weighted MRI Lumbar spine.

ParameterSagittal T2
SequenceFast Recovery Fast Spin echo (FRFSE)
TR (msec)2494
TE (msec)100
Matrix size224 × 199
Slice thickness (mm)3.5
Flip angle (Degrees)90

Segmentation and radiomic feature extraction

The DICOM MRI images of MRI T2 weighted images of the lumbar vertebrae and disc space for each patient were loaded into the 3D slicer software (Version 4.10.2). Segmentation of the lumbar vertebrae and intervertebral discs (IVD) was performed manually (Figure 1). Radiomic features from the lumbar vertebrae and IVDs were extracted for both the cases and controls (Supplementary file 1).

5ad52faf-81f6-4667-8f35-2446fede6060_figure1.gif

Figure 1. Showing the segmentation of lumbar vertebrae and intervertebral disc on T2 weighted image.

Machine learning model

ML classifiers such as Random Forest, Decision tree, logistic regression, k-nearest neighbors (KNN), and AdaBoost were used. We have utilized a wide range of ML classifiers since different classifiers may perform better with data features and this allows in through benchmarking and the selection of optimal model for a specific problem. Each ML method has its own advantages, random forest excels in robust and accuracy, decision tree offers interpretability, logistic regression is effective for linear relationships, KNN is good for smaller dataset, adaboost improves performance by merging weak learners. The ML classifiers were run in the Conda virtual environment, which was integrated with Python (version 3.9.7).28 Several libraries such as NumPy,29 scikit,30 pandas,31 seaborn,32 matplotlib,33 and others were installed to support the analysis. The training of the models utilized 8 GB of RAM, along with an Intel®coreTM i5 Central Processing Unit (HP ProBook 440). The study was conducted on a 64-bit Windows operating system to run the classifiers.

Data normalization: This is an essential step because it assigns equal weight to each variable, preventing any single variable from disproportionately influencing the model performance owing to its large numerical values. In our study, the min-max normalization (rescaling) technique was employed for the entire dataset.

Feature selection: Mutual information method was used for the feature selection of the top 20 radiomic features at each lumbar vertebra and IVD for both cases and controls.

Model training and validation

The data were split into training and testing ratios of 80:20. The data were subjected to five-fold cross-validation, where different subsets were trained to assess model efficiency. The input data were split into five equal parts: four groups for training and five for testing using various permutations and combinations in the cross-validation process. The parameters were hypertuned using a grid search technique that automates this tuning to determine the best values.

Performance metrics of the ML models

The performance metrics of the ML models for the test dataset were assessed using accuracy, precision, F1 score, area under the curve (AUC), Hamming loss, Jaccord score, log loss, and Mathew’s correlation coefficient (MCC).

Validation of the testing model from the confusion matrix was assessed using

Accuracy=((TP+TN)/(TP+TN+FP+FN))
Precision=(TP/(TP+FP))=PPV
Recall=(TP/(TP+FN))
F1=(2_(Precision_Recall)/(Precision+Recall))

Where, TP - True positive, TN - True negative, FP - False positive, FN - False negative, PPV - Positive predictive value.

Deep learning Model: MRI Images of the lumbar spine were collected in the Joint Photographic Experts Group (JPEG) format. The input images were cropped and resized to 184 × 282 pixels to mainly include the lumbar vertebrae and disc space. Intensity normalization was performed on all images such that the pixel values across multiple images were normalized to the same statistical distribution, facilitating improved analysis of MRI images. Further as an assistance DL, a subset of AI, was used. Transfer learning is a DL technique in which pre-trained networks are utilized to train the model for custom usage. One of the major advantages of this method is that it avoids training the network from scratch by using weights that are trained on the 1000 class ImageNet dataset. There are multiple pre-trained models in which GoogleNet and ResNet (18 and 50) were used (Figures 24). These are convolutional neural networks, meaning that convolution layers play a major role. These are feature extraction layers that perform convolution with different kernels. GoogleNet and ResNet were chosen due to their powerful ability to learn complex patterns in data, especially in image analysis, medical diagnostics and classification problems. Both are exceptional at automatically learning deep features particularly involving images, where they can capture complex and minute details. GoogleNet inception modules process input using parallel convolution layers with varying kernel sizes, enhancing efficiency by capturing features at different scales with fewer parameters. ResNet solves the vanishing gradient issue, making it possible to train very deep networks efficiently. This permit learning more complicated representations improves performance and tasks like image classification and object recognition.

5ad52faf-81f6-4667-8f35-2446fede6060_figure2.gif

Figure 2. Architectural configuration delineating the structure of ResNet50.

5ad52faf-81f6-4667-8f35-2446fede6060_figure3.gif

Figure 3. Architectural configuration delineating the structure of ResNet18.

5ad52faf-81f6-4667-8f35-2446fede6060_figure4.gif

Figure 4. Architectural configuration delineating the structure of GoogleNet.

GoogLeNet is a part of the inception model, which is a 22-layer deep network that is computationally efficient. On the other hand, ResNet has an advantage over other networks because it consists of skip connections, which avoids the vanishing gradient.

The dataset was divided into a 90:10 training and test split ratio. The training set was further divided into training and validation sets. The validation of the dataset occurs simultaneously during training. Deep learning (DL) models were implemented using MATLAB 2023b owing to its better visualization and ease of use.

To obtain optimum results, the hyperparameters were adjusted. Epochs are the number of times the entire dataset is passed through the network for training. In this case, the dataset was trained for 30, 50, and 100 epochs, respectively. Initial learn rate is the amount of learning that happens at a step i.e. step size at which parameters are updated during training process. This was set to 0.001. The optimizer used was a Stochastic Gradient Descent with momentum (sgdm).

DL model performance was assessed using the specificity, sensitivity, Precision, NPV, Recall, F1 score.

A binary classification problem which helps in predicting the LBP was used for ML and DL methods.

Results

In our study we included 100 symptomatic and asymptomatic cases.

The mean age and sex of the symptomatic and asymptomatic cases are shown in Table 1.

In our study, we analyzed ML models based on radiomic features and DL methods to predict LBP in symptomatic and asymptomatic cases.

Feature reduction for ML model development: The top 20 radiomic features for each lumbar vertebra and IVD were identified using a mutual information algorithm and are presented in Tables 3, 4.

Table 3. Showing the top 20 radiomic features selected at each vertebrae level for ML Models.

S.No.VertebraeFeatures selected for ML models
1L1Dependence entropy, Run Length Non-Uniformity, Energy, Zone Entropy, Total Energy, Size Zone Non-Uniformity Normalized, Large Area Emphasis, Maximum 2D Diameter Row, Small Area Emphasis, Zone Variance, Robust Mean Absolute Deviation, Idn, Least Axis Length, Difference Variance, Strength, Imc2, Long Run High Grey Level Emphasis, Joint Entropy, entropy, Large Dependence High Grey Level Emphasis
2L2Surface Area, Dependence Entropy, Total Energy, Long Run High Gray Level Emphasis, Maximum 2D Diameter Row, Entropy, Idn, Imc2, Energy, Kurtosis, Least AxisLength, Small Dependence Low Gray Level Emphasis, Mean, Cluster Tendency, Run Percentage, Run Length Non Uniformity, Size Zone Non Uniformity Normalized, MCC, Long Run Low Gray Level Emphasis, Large Dependence Emphasis
3L3Least Axis Length, Idmn, Long Run High Grade Level Emphasis, Mean, Run Entropy, Run Length Non Uniformity Normalized, Strength, Kurtosis, Inverse Variance, Maximum3D Diameter, Run Variance, Sum Entropy, Auto Correlation, Maximum, Size Zone Non Uniformity Normalised, Short Run Emphasis, Skewness, Small Dependence High Gray Level Emphasis, Large Dependence High Gray Level Emphasis, Correlation
4L4Minor Axis Length, Large Dependence Emphasis, Gray Level Variance, Gray Level Non-Uniformity Normalized, Coarseness, Sum Squares, Difference Variance, Maximum 2D Diameter Row, Large Area High Gray Level Emphasis, Id, Elongation, Zone Entropy, Contrast, Maximum 2D Diameter Slice, Large Area Low Gray Level Emphasis, Mean, Large Dependence Low Gray Level Emphasis, Least Axis Length, Root Mean Squared, Low Gray Level Emphasis
5L5Contrast, Maximum, Small Area Low Gray Level Emphasis, Mean, Complexity, Least Axis Length, Range, Large Area Low Gray Level Emphasis, Idmn, Gray Level Non-Uniformity, Interquartile Range, Run Variance, 90 Percentile, Cluster Shade, Difference Entropy, Gray Level Variance, Maximum Probability, Gray Level Non Uniformity, Large Dependence Low Gray Level Emphasis, Run Entropy

Table 4. Showing the top 20 radiomic features selected at each IVD for ML Models.

S.No.Intervertebral discFeatures selected for ML models
1L1-L2Sum Squares, Cluster Tendency, Zone Variance, Root Mean Squared, Short Run High Gray Level Emphasis, Joint Average, Sum Average, Large Area Emphasis, Gray Level Non-Uniformity Normalized, Entropy, Cluster Shade, Short Run Emphasis, High Gray Level Run Emphasis, Maximum 2D Diameter Row, Run Percentage, Dependence Entropy, Contrast, Median, Run Entropy, Interquartile Range
2L2-L3Maximum 2D Diameter Row, Short Run Emphasis, Maximum, High Gray Level Zone Emphasis, Maximum Probability, Difference Variance, Dependence Entropy, Id, Gray Level Non Uniformity Normalized, Contrast, Gray Level Non Uniformity Normalized, Gray Level Variance, High Gray Level Emphasis, Skewness, Range, Joint Entropy, Cluster Prominence, Run Variance, Low Gray Level Run Emphasis, Complexity
3L3-L4High Gray Level Zone Emphasis, Maximum 3D Diameter, Maximum, Range, Robust Mean Absolute Deviation, Auto correlation, Small Area High Gray Level Emphasis, Low Gray Level Zone Emphasis, Mean, Run Variance, Zone Entropy, Interquartile Range, Energy, Sum Entropy, Joint Average, Sum Average, Gray Level Variance, Long Run High Gray Level Emphasis, Cluster Prominence, Gray Level Normalized
4L4-L5Low Gray Level Emphasis, Dependence Entropy, Entropy, Minor Axis Length, Correlation, Small Area High Gray Level Emphasis, Maximum Probability, Difference Variance, Dependence Non Uniformity Normalized, Contrast, McC, Sum Average, Joint Average, Maximum 2D Diameter Column, Dependence Non Uniformity, Maximum 2D Diameter Row, Run Variance, Run Length Non Uniformity Normalized, Dependence Variance, Energy
5L5-S1Flatness, Zone Percentage, Least Axis Length, Entropy, Joint Entropy, Gray Level Non Uniformity Normalized, Joint Energy, Gray Level Non Uniformity, Size Zone Non Uniformity, Coarseness, Dependence Non Uniformity, Surface Area, Low Gray Level Zone Emphasis, Short Run High Gray Level Emphasis, Maximum, Cluster Shade, Short Run Emphasis, Dependence Non Uniformity Normalized, Range, Run Length Non Uniformity.

Machine learning (ML) classifiers

In this study, ML methods such as Random Forest, Decision tree, Logistic regression, KNN, Adaboost were studied. The performance of the ML classifiers at the lumbar vertebrae and intervertebral disc using five-fold cross-validation is shown in Tables 5, 643 for the five classifier models.

Lumbar vertebrae

The random forest showed high performance across all lumbar vertebrae, with AUC values from 0.83 to 0.88 across all lumbar vertebrae. Decision tree models exhibited moderate performance with AUC values between 0.65 and 0.76, suggesting lower predictive accuracy compared to other models. Logistic regression performed well, particularly at L5 with an AUC of 0.82, and maintained good performance across other vertebral levels with AUC values from 0.73 0.79. KNN also showed strong performance, especially at L2-L4 vertebrae with AUC values of 0.79 to 0.83, and slightly lower AUC values at L1(0.70) and L5(0.68). AdaBoost demonstrated high performance at L2–L5 vertebrae with AUC values of 0.82 to 0.90, although its performance at L1 was moderate, with an AUC of 0.67. The ML models showed slightly improved performance at the lower vertebral levels (L4 and L5) compared to the upper vertebral levels (L1-L3) Figures 5, 6.

5ad52faf-81f6-4667-8f35-2446fede6060_figure5.gif

Figure 5. ROC curve and confusion matrix for random forest (a,b) and adaboost (c,d) at L4.

5ad52faf-81f6-4667-8f35-2446fede6060_figure6.gif

Figure 6. ROC curve and confusion matrix for random forest (a,b) and adaboost (c,d) at L5.

Lumbar intervertebral disc

Random forest showed strong performance at L2-L3, L3-L4 and L4-L5 IVD’s with AUCs of 0.88 and the highest at L5-S1 IVD (AUC-0.92). Decision tree models showed moderate performance, with the highest AUC at the L5-S1 IVD (0.85) and lower values at other disks, particularly at the L4-L5 IVD (AUC-0.65). Logistic regression showed the highest AUC at L3-L4 IVD (0.90) and maintained good performance at other disks, with AUC ranging from 0.79 0.87. KNN showed the highest AUC at the L4-L5 disk IVD (0.88) and moderately at other IVD disk between 0.73 and 0.78. Adaboost showed the highest AUC (0.97) at the L5-S1 IVD and exhibited strong results at the L2-L3 (0.86) and L3-L4 (0.83) IVD. The random forest and adaboost models showed high performance, particularly at the L5-S1 IVD (Figure 7).

5ad52faf-81f6-4667-8f35-2446fede6060_figure7.gif

Figure 7. ROC curve and confusion matrix for random forest (a,b) and adaboost (c,d) at L5-S1 IVD.

Deep learning methods

The performance measures of the DL methods for LBP prediction are presented in Table 7.

Table 7. Showing the performance measures of DL methods for test dataset in prediction of LBP.

DL ModelPerformance measures
SensitivitySpecificityPrecisionNPVFPRAccuracyF1 scoreMCC
ResNet50, 30 Epoch0.770.810.820.750.190.790.790.57
GoogleNet, 30 Epoch0.850.860.860.850.140.850.860.71
ResNet18, 30 Epoch0.800.890.910.770.100.840.850.69
ResNet50, 50 epoch0.780.830.850.750.160.800.810.61
GoogleNet, 50 Epoch0.830.850.850.830.150.840.840.68
ResNet18, 50 Epoch0.790.910.920.750.090.840.850.69
ResNet50, 100 Epoch0.780.830.850.750.160.800.810.61

GoogleNet at 30 epochs outperformed the other DL models in terms of accuracy (0.85) and F1 score (0.86) for predicting LBP. ResNet 18 at 30 epochs had the second highest performance, with high accuracy (0.84) and F1 score (0.85). ResNet 50 showed consistent results at both 50 epochs (0.80-accuracy and 0.81-F1 score) and 100 epochs (accuracy-0.80 and F1 score-0.81), but with slightly lower performance metrics than GoogleNet and ResNet 18 at 30 epochs (accuracy-0.84 and F1 score-0.85).

Discussion

In our study, we used ML and DL algorithms to predict LBP by using T2 weighted images of the lumbar spine. MRI studies have shown that a significant percentage of asymptomatic patients have abnormalities related to lumbar intervertebral discs. Imaging studies often fail to provide definitive answers regarding the source of pain. Imaging techniques are valuable tools for diagnosis and are often clinically inconclusive in identifying the precise etiology of low back pain. The high prevalence of asymptomatic abnormalities, risk of overdiagnosis, and lack of correlation between imaging findings and pain highlight the need for a cautious and judicious approach to the use of imaging in LBP management. Clinicians should rely on thorough clinical assessment and consider imaging findings as part of a broader diagnostic strategy rather than the sole determinant of patient care.3436

Our study noted that ML models showed improved performance at the lower vertebral levels (L4 and L5) compared to the upper vertebral levels (L1-L3) and random forest and adaboost models exhibited particularly high performance at the L5-S1 IVD. Abdollah et al.37 reported that texture features extracted from T2 maps revealed significant textural differences in the L5-S1 lumbar IVD, upper and lower endplate regions, and the L4-5 lower endplate regions between individuals who are symptomatic and asymptomatic of LBP, which may not be apparent to the naked eye. The IVD and endplate zones of patients with LBP were more anisotropic, suggesting different patterns of degeneration due to varying patterns of collagen network destruction. Increased anisotropy may indicate fluid redistribution and changes in hydrostatic pressure, causing an uneven load distribution in pain-sensitive areas. Differences in Gray Level Co-occurrence Matrix features such as contrast, energy, and homogeneity provide additional evidence for the hypothesis of unique degeneration patterns in LBP. The random forest algorithm and Gini importance index indicate energy as a unique feature for classification. Ketola et al.38 also reported difference in T2 weighted images analyzed using logistic regression to classify textural features based on a pain questionnaire in a sample of 518 subjects. The best classification accuracy (83%) and AUC (0.91) were achieved at the lowest two IVDS, with a specificity score of 83% and a sensitivity score of 82%. These results suggest that texture features in the lower lumbar discs (L4-L5 and L5-S1) are more predictive of LBP, supported by the findings of increased anisotropy and genetic correlations. Another study by Aggarwal et al.39 reported that decreased L2 and L4 disc heights significantly predicted LBP. They also reported that thickening of the ligamentum flavum, particularly at the lower lumbar levels, contributes to spinal stenosis and LBP.

The DL models used in our study were useful for predicting LBP using MRI. GoogleNet with 30 epochs showed the highest performance with an accuracy of 0.85 and an F1 score (0.86) for predicting LBP. Won et al.40 employed a CNN to automatically grade spinal stenosis on MRI images of 542 patients, obtaining accuracy measures of 83.0% and 77.9% in comparison to the ground truth assessed by two separate doctors. Jamaludin et al.41,42 developed a CNN that segments the vertebrae and intervertebral discs with an accuracy of 95.6%. This model also identifies disc narrowing, marrow changes, endplate defects, spondylolisthesis, and central canal stenosis, and performs Pfirrmann grading with accuracy percentages ranging from 70.1% and 95.4%. Additionally, it can directly highlight abnormalities of the IVD and vertebrae using heatmaps, referred to as evidence hotspots41,42

According to our study, ML and DL models could provide more efficient, reliable, noninvasive diagnostic insights by accurately identifying abnormalities in the lumbar vertebrae and intervertebral discs (IVDs), even in cases where conventional MRI image assessments were inconclusive. By improving the ability to predict LBP, ML and DL algorithms could guide better clinical-decision making, reducing unnecessary surgical interventions.

Our study had a few limitations. First, the sample size is sufficient for the initial analysis; a larger sample size could provide more robust results and improve the reliability of machine learning (ML), and deep learning (DL) models. Second, manual segmentation of the lumbar vertebrae and IVD is time-consuming and subject to inter-operator variability. Automated segmentation methods can enhance reproducibility and efficiency. Third, the study did not include risk factors, radiological findings, or their role in assessing LBP using machine learning methods.

Conclusion

Our study found that ML classifiers, such as random forest and adaboost, exhibited the highest performance, particularly in the lower lumbar vertebrae and IVD, while decision tree and logistic regression models showed moderate performance in the prediction of LBP. For DL methods, GoogleNet achieved the best results at 30 epochs, followed closely by ResNet, which demonstrated high precision and specificity. Our findings highlight the potential of advanced ML and DL techniques for accurately predicting LBP, with random forest, AdaBoost, and GoogleNet showing the most promising results.

Ethical approval

This was a prospective, case-control study. The institutional ethical committee (IEC2:179/2023) was obtained from Kasturba medical college and Hospital, Manipal, India on 20th July 2023, followed by the Clinical trial registry (CTRI) registration: CTRI/2023/08/056954, 25/08/2023, https://ctri.nic.in/Clinicaltrials/login.php. Written Informed consent (IC) was obtained from all the participants.

Comments on this article Comments (0)

Version 2
VERSION 2 PUBLISHED 10 Sep 2024
Comment
Author details Author details
Competing interests
Grant information
Copyright
Download
 
Export To
metrics
Views Downloads
F1000Research - -
PubMed Central
Data from PMC are received and updated monthly.
- -
Citations
CITE
how to cite this article
Muhaimil A, Pendem S, Sampathilla N et al. Role of Artificial intelligence model in prediction of low back pain using T2 weighted MRI of Lumbar spine [version 2; peer review: 2 approved, 1 approved with reservations, 1 not approved]. F1000Research 2024, 13:1035 (https://doi.org/10.12688/f1000research.154680.2)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.
track
receive updates on this article
Track an article to receive email alerts on any updates to this article.

Open Peer Review

Current Reviewer Status: ?
Key to Reviewer Statuses VIEW
ApprovedThe paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approvedFundamental flaws in the paper seriously undermine the findings and conclusions
Version 2
VERSION 2
PUBLISHED 10 Oct 2024
Revised
Views
16
Cite
Reviewer Report 18 Nov 2024
Eugene Ozhinsky, University of California San Francisco, San Francisco, California, USA 
Not Approved
VIEWS 16
The manuscript describes a study using machine learning and deep learning techniques to predic whether the patients had lower-back pain. The study addresses an important issue of diagnosing lower-back pain. Here is a list of questions the authors could use ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Ozhinsky E. Reviewer Report For: Role of Artificial intelligence model in prediction of low back pain using T2 weighted MRI of Lumbar spine [version 2; peer review: 2 approved, 1 approved with reservations, 1 not approved]. F1000Research 2024, 13:1035 (https://doi.org/10.5256/f1000research.173049.r335609)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Views
5
Cite
Reviewer Report 30 Oct 2024
Shashi Kumar Shetty, K S Hegde Medical Academy, Mangalore,, Karnataka, India 
Approved
VIEWS 5
Major comments:
  • The study effectively explores the role of machine learning (ML) and deep learning (DL) models in predicting low back pain using T2 weighted MRI images. This novel method has substantial potential for improving non-invasive
... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Shetty SK. Reviewer Report For: Role of Artificial intelligence model in prediction of low back pain using T2 weighted MRI of Lumbar spine [version 2; peer review: 2 approved, 1 approved with reservations, 1 not approved]. F1000Research 2024, 13:1035 (https://doi.org/10.5256/f1000research.173049.r335604)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Views
8
Cite
Reviewer Report 30 Oct 2024
Priyanka Chandrasekhar, Sri Ramachandra Institute of Higher Education and Research (Deemed to be University), Chennai, Tamil Nadu, India;  Department of Allied Health Sciences, The Apollo University, Chittoor, Andhra Pradesh, India 
Approved
VIEWS 8
Major comments:
  • The study represents a progressive approach to diagnose low back pain (LBP), utilizing radiomics based machine learning (ML) and T2 weighted MRI image based deep learning (DL) models. In musculoskeletal imaging, precise pain source
... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Chandrasekhar P. Reviewer Report For: Role of Artificial intelligence model in prediction of low back pain using T2 weighted MRI of Lumbar spine [version 2; peer review: 2 approved, 1 approved with reservations, 1 not approved]. F1000Research 2024, 13:1035 (https://doi.org/10.5256/f1000research.173049.r335830)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Views
22
Cite
Reviewer Report 12 Oct 2024
Tarun Gangil, The institute of cancer research, London, UK 
Approved with Reservations
VIEWS 22
Question 1: Authors should highlight , why they have chosen a wide range of machine learning algorithms, clustering algorithm (KNN) and deep learning algorithms such as ResNet and GoogleNet ?
No concerns anymore.

Question 2: Do the ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Gangil T. Reviewer Report For: Role of Artificial intelligence model in prediction of low back pain using T2 weighted MRI of Lumbar spine [version 2; peer review: 2 approved, 1 approved with reservations, 1 not approved]. F1000Research 2024, 13:1035 (https://doi.org/10.5256/f1000research.173049.r330660)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
Version 1
VERSION 1
PUBLISHED 10 Sep 2024
Views
27
Cite
Reviewer Report 01 Oct 2024
Tarun Gangil, The institute of cancer research, London, UK 
Approved with Reservations
VIEWS 27
Authors should highlight , why they have chosen a wide range of machine learning algorithms, clustering algorithm (KNN) and deep learning algorithms such as ResNet and GoogleNet ?

Do the analysis were performed only on the radiomics ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Gangil T. Reviewer Report For: Role of Artificial intelligence model in prediction of low back pain using T2 weighted MRI of Lumbar spine [version 2; peer review: 2 approved, 1 approved with reservations, 1 not approved]. F1000Research 2024, 13:1035 (https://doi.org/10.5256/f1000research.169735.r325460)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.
  • Author Response 10 Oct 2024
    Saikiran Pendem, Department of Medical Imaging Technology, Manipal College of Health Professions, Manipal Academy of Higher Education, Karnataka, 576104, India
    10 Oct 2024
    Author Response
    Question 1: Authors should highlight , why they have chosen a wide range of machine learning algorithms, clustering algorithm (KNN) and deep learning algorithms such as ResNet and GoogleNet ?
    ... Continue reading
COMMENTS ON THIS REPORT
  • Author Response 10 Oct 2024
    Saikiran Pendem, Department of Medical Imaging Technology, Manipal College of Health Professions, Manipal Academy of Higher Education, Karnataka, 576104, India
    10 Oct 2024
    Author Response
    Question 1: Authors should highlight , why they have chosen a wide range of machine learning algorithms, clustering algorithm (KNN) and deep learning algorithms such as ResNet and GoogleNet ?
    ... Continue reading
Views
17
Cite
Reviewer Report 24 Sep 2024
Priyanka Chandrasekhar, Sri Ramachandra Institute of Higher Education and Research (Deemed to be University), Chennai, Tamil Nadu, India;  Department of Allied Health Sciences, The Apollo University, Chittoor, Andhra Pradesh, India 
Approved
VIEWS 17
Major comments:
The research article presents a case-control study to assess machine and deep learning models for prediction of low back pain using T2-weighted MRI images of lumbar spine.
The utility of mutual information for radiomic feature ... Continue reading
CITE
CITE
HOW TO CITE THIS REPORT
Chandrasekhar P. Reviewer Report For: Role of Artificial intelligence model in prediction of low back pain using T2 weighted MRI of Lumbar spine [version 2; peer review: 2 approved, 1 approved with reservations, 1 not approved]. F1000Research 2024, 13:1035 (https://doi.org/10.5256/f1000research.169735.r325453)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Comments on this article Comments (0)

Version 2
VERSION 2 PUBLISHED 10 Sep 2024
Comment
Alongside their report, reviewers assign a status to the article:
Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested
Approved with reservations - A number of small changes, sometimes more significant revisions are required to address specific details and improve the papers academic merit.
Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions
Sign In
If you've forgotten your password, please enter your email address below and we'll send you instructions on how to reset your password.

The email address should be the one you originally registered with F1000.

Email address not valid, please try again

You registered with F1000 via Google, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Google account password, please click here.

You registered with F1000 via Facebook, so we cannot reset your password.

To sign in, please click here.

If you still need help with your Facebook account password, please click here.

Code not correct, please try again
Email us for further assistance.
Server error, please try again.