Assessing the severity of lumbar disc herniation (LDH) on MR images is crucial for selecting suitable surgical candidates. However, interpreting MR images is time-consuming and repetitive. This study aimed to develop and evaluate a deep learning-based diagnostic model for automated LDH detection and classification on lumbar axial T2-weighted MR images.
This retrospective study analyzed a total of 1115 patients, drawn from a development dataset (1015 patients, 15,249 images) and an external test dataset (100 patients, 1273 images). Experts labeled all images by consensus according to the Michigan State University (MSU) classification criteria, and the final labels served as the reference standard. The automated diagnostic model comprised Faster R-CNN and ResNeXt101 as the detection and classification networks, respectively. Diagnostic performance was evaluated using mean intersection over union (IoU), accuracy, precision, sensitivity, specificity, F1 score, the area under the receiver operating characteristic curve (AUC), and the intraclass correlation coefficient (ICC), each with 95% confidence intervals (CIs).
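As an illustration of the detection and classification metrics named above, the following is a minimal sketch in Python. It is not the authors' evaluation code: the function names `box_iou` and `binary_metrics`, the `(x1, y1, x2, y2)` box format, and the one-vs-rest confusion counts are assumptions made for this example.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).

    Assumed box format for illustration; the source does not specify one.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity, specificity, and F1 from counts.

    For a multi-class grading scheme, these counts would be taken
    one-vs-rest for each class (an assumption of this sketch).
    """
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)  # also called recall
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


if __name__ == "__main__":
    # Toy inputs, not data from the study.
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(binary_metrics(tp=90, fp=10, tn=80, fn=20))
```

A mean IoU such as the one reported would then be the average of `box_iou` over matched predicted and reference boxes; how predictions were matched to reference boxes is not stated in the source.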
High detection consistency was obtained in both the internal test dataset (mean IoU = 0.82, precision = 98.4%, sensitivity = 99.4%) and the external test dataset (mean IoU = 0.70, precision = 96.3%, sensitivity = 97.8%). Overall accuracy for LDH classification was 87.70% (95% CI: 86.59%-88.86%) in the internal test dataset and 74.23% (95% CI: 71.83%-76.75%) in the external test dataset. In internal testing, the proposed model achieved high agreement in classification (ICC = 0.87, 95% CI: 0.86-0.88, P < 0.001), which was higher than in external testing (ICC = 0.79, 95% CI: 0.76-0.81, P < 0.001). The AUC for model classification was 0.965 (95% CI: 0.962-0.968) in the internal test dataset and 0.916 (95% CI: 0.908-0.925) in the external test dataset.
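The accuracies above are reported with 95% confidence intervals, but the source does not say how those intervals were computed. As one common option, the sketch below computes a Wilson score interval for a proportion. The function name `wilson_interval` and the example counts are hypothetical, and the result is not expected to reproduce the published intervals.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives about 95%).

    One standard choice among several; the study's actual CI method
    is not stated in the source.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Toy counts, not data from the study.
    low, high = wilson_interval(correct=80, total=100)
    print(f"accuracy = 0.800, 95% CI: {low:.3f}-{high:.3f}")
```

Unlike the simpler normal-approximation interval, the Wilson interval stays within [0, 1] and behaves reasonably when accuracy is close to 0% or 100%.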
The automated diagnostic model achieved high performance in detecting and classifying LDH and showed considerable consistency with the experts' classification.