BACKGROUND: Three-dimensional (3D) ultrasound (US) imaging has shown promise for non-invasive monitoring of changes in the lateral brain ventricles of neonates suffering from intraventricular hemorrhage. Because of poorly defined anatomical boundaries and a low signal-to-noise ratio, fully supervised methods for segmenting the lateral ventricles in 3D US images require a large dataset of images annotated by trained physicians, which is tedious, time-consuming, and expensive to produce. Training fully supervised segmentation methods on a small dataset may lead to overfitting and hence reduce their generalizability. Semi-supervised learning (SSL) methods for 3D US segmentation may be able to address these challenges, but most existing SSL methods have been developed for magnetic resonance or computed tomography (CT) images.
OBJECTIVE: To develop a fast, lightweight, and accurate SSL method, designed specifically for 3D US images, that uses unlabeled data to improve segmentation performance.
METHODS: We propose an SSL framework that leverages the shape-encoding ability of an autoencoder network to enforce complex shape and size constraints on a 3D U-Net segmentation model. The autoencoder created pseudo-labels, based on the 3D U-Net predicted segmentations, that enforce shape constraints. An adversarial discriminator network then determined whether images came from the labeled or unlabeled data distribution. We used 887 3D US images, of which 87 had manually annotated labels and 800 were unlabeled. Training/validation/testing sets of 25/12/50, 25/12/25, and 50/12/25 images were used for model experimentation. The Dice similarity coefficient (DSC), mean absolute surface distance (MAD), and absolute volumetric difference (VD) were used as metrics for comparison with other benchmarks. The baseline benchmark was the fully supervised vanilla 3D U-Net, while the dual task consistency, shape-aware semi-supervised network, correlation-aware mutual learning, and 3D U-Net ensemble models were used as state-of-the-art benchmarks. The Wilcoxon signed-rank test was used to test statistical significance between algorithms for DSC and VD, with a threshold of p < 0.05 that was corrected to p < 0.01 using the Bonferroni correction. The random-access memory (RAM) trace and the number of trainable parameters were used to compare computing efficiency between models.
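As a rough illustration of two of the metrics named above, the following sketch computes the DSC and an absolute volume difference for a pair of binary 3D masks, and shows the arithmetic of a Bonferroni-corrected threshold. It is not the authors' implementation. The function names, the `voxel_volume_mm3` parameter, the handling of empty masks, and the choice of five comparisons (which is simply what would turn 0.05 into 0.01) are all assumptions made for illustration. MAD is omitted because it requires surface extraction and distance computations.

```python
import numpy as np


def dice_similarity(pred, truth):
    """DSC = 2|A & B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total


def absolute_volume_difference(pred, truth, voxel_volume_mm3=1.0):
    """|V_pred - V_truth|, scaled by a hypothetical per-voxel volume."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    return abs(int(pred.sum()) - int(truth.sum())) * voxel_volume_mm3


def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / n_comparisons


# Toy example: an 8-voxel reference mask and a 12-voxel prediction
# that fully contains it (overlap of 8 voxels).
truth = np.zeros((4, 4, 4), dtype=bool)
truth[1:3, 1:3, 1:3] = True
pred = np.zeros_like(truth)
pred[1:3, 1:3, 1:4] = True

dsc = dice_similarity(pred, truth)            # 2 * 8 / (12 + 8)
vd = absolute_volume_difference(pred, truth)  # |12 - 8| voxels
threshold = bonferroni_threshold(0.05, 5)     # assumed five comparisons
```

In practice the masks would come from thresholded network outputs and manual annotations, and the voxel volume would be taken from the image spacing.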
RESULTS: Relative to the baseline 3D U-Net model, our shape-encoding SSL method reported a mean DSC improvement of 6.5%, 7.7%, and 4.1% with a 95% confidence interval of 4.2%, 5.7%, and 2.1% using image data splits of 25/12/50, 25/12/25, and 50/12/25, respectively. Our method used only 1 GB more RAM than the baseline 3D U-Net and required less than half the RAM and trainable parameters of the 3D U-Net ensemble method.
CONCLUSIONS: Based on our extensive literature survey, this is one of the first reported works to propose an SSL method designed for segmenting organs in 3D US images, and specifically one that incorporates unlabeled data for segmenting neonatal cerebral lateral ventricles. Compared to state-of-the-art SSL and fully supervised learning methods, our method yielded the highest DSC and lowest VD while remaining computationally efficient.