Semantic segmentation
  • Article type: Journal Article
    Echocardiography is one of the most commonly used imaging modalities for the diagnosis of congenital heart disease. Echocardiographic image analysis is crucial to obtaining accurate cardiac anatomy information. Semantic segmentation models can be used to precisely delimit the borders of the left ventricle, and allow an accurate and automatic identification of the region of interest, which can be extremely useful for cardiologists. In the field of computer vision, convolutional neural network (CNN) architectures remain dominant. Existing CNN approaches have proved highly efficient for the segmentation of various medical images over the past decade. However, these solutions usually struggle to capture long-range dependencies, especially when it comes to images with objects of different scales and complex structures. In this study, we present an efficient method for semantic segmentation of echocardiographic images that overcomes these challenges by leveraging the self-attention mechanism of the Transformer architecture. The proposed solution extracts long-range dependencies and efficiently processes objects at different scales, improving performance in a variety of tasks. We introduce Shifted Windows Transformer models (Swin Transformers), which encode both the content of anatomical structures and the relationship between them. Our solution combines the Swin Transformer and U-Net architectures, producing a U-shaped variant. The validation of the proposed method is performed with the EchoNet-Dynamic dataset used to train our model. The results show an accuracy of 0.97, a Dice coefficient of 0.87, and an Intersection over Union (IoU) of 0.78. Swin Transformer models are promising for semantically segmenting echocardiographic images and may help assist cardiologists in automatically analyzing and measuring complex echocardiographic images.
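The metrics reported above (accuracy, Dice, IoU) are standard overlap measures on binary masks. A minimal NumPy sketch, not the authors' evaluation code:

```python
import numpy as np

def accuracy(pred, target):
    """Fraction of pixels labeled correctly (foreground and background)."""
    return float((pred == target).mean())

def dice(pred, target, eps=1e-7):
    """Dice coefficient: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, target).sum()
    return float((2.0 * inter + eps) / (pred.sum() + target.sum() + eps))

def iou(pred, target, eps=1e-7):
    """Intersection over Union: |A∩B| / |A∪B|."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float((inter + eps) / (union + eps))

# Toy 4x4 binary masks (1 = left ventricle, 0 = background).
pred = np.array([[0, 0, 1, 1], [0, 1, 1, 1], [0, 1, 1, 0], [0, 0, 0, 0]])
target = np.array([[0, 0, 1, 1], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]])
```

Note that Dice is always at least as large as IoU on the same masks, which is consistent with the 0.87 vs. 0.78 figures above.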

  • Article type: Journal Article
    Deep learning has recently made significant progress in semantic segmentation. However, the current methods face critical challenges. The segmentation process often lacks sufficient contextual information and attention mechanisms, low-level features lack semantic richness, and high-level features suffer from poor resolution. These limitations reduce the model's ability to accurately understand and process scene details, particularly in complex scenarios, leading to segmentation outputs that may have inaccuracies in boundary delineation, misclassification of regions, and poor handling of small or overlapping objects. To address these challenges, this paper proposes a Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion with the Multi-Scale Dilated Convolutional Pyramid (SDAMNet). Specifically, the Dilated Convolutional Atrous Spatial Pyramid Pooling (DCASPP) module is developed to enhance contextual information in semantic segmentation. Additionally, a Semantic Channel Space Details Module (SCSDM) is devised to improve the extraction of significant features through multi-scale feature fusion and adaptive feature selection, enhancing the model's perceptual capability for key regions and optimizing semantic understanding and segmentation performance. Furthermore, a Semantic Features Fusion Module (SFFM) is constructed to address the semantic deficiency in low-level features and the low resolution in high-level features. The effectiveness of SDAMNet is demonstrated on two datasets, revealing significant improvements in Mean Intersection over Union (MIOU) by 2.89% and 2.13%, respectively, compared to the Deeplabv3+ network.
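The MIOU figure used here for comparison against Deeplabv3+ is the per-class IoU averaged over classes, typically computed from a confusion matrix. A generic sketch (not SDAMNet code); classes absent from both prediction and ground truth score 0 here, whereas some implementations exclude them from the mean:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union from a confusion matrix.

    confusion[i, j] counts pixels of true class i predicted as class j.
    """
    confusion = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(target.ravel(), pred.ravel()):
        confusion[t, p] += 1
    inter = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - inter
    per_class_iou = inter / np.maximum(union, 1)  # avoid 0/0 for absent classes
    return float(per_class_iou.mean())

# Toy 2x2 label maps with 3 classes.
pred = np.array([[0, 1], [2, 2]])
target = np.array([[0, 1], [1, 2]])
```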

  • Article type: Journal Article
    Most real-time semantic segmentation networks use shallow architectures to achieve fast inference speeds. This approach, however, limits a network's receptive field. Concurrently, feature information extraction is restricted to a single scale, which reduces the network's ability to generalize and maintain robustness. Furthermore, loss of image spatial details negatively impacts segmentation accuracy. To address these limitations, this paper proposes a Multiscale Context Pyramid Pooling and Spatial Detail Enhancement Network (BMSeNet). First, to address the limitation of singular semantic feature scales, a Multiscale Context Pyramid Pooling Module (MSCPPM) is introduced. By leveraging various pooling operations, this module efficiently enlarges the receptive field and better aggregates multiscale contextual information. Moreover, a Spatial Detail Enhancement Module (SDEM) is designed to effectively compensate for lost spatial detail information and significantly enhance the perception of spatial details. Finally, a Bilateral Attention Fusion Module (BAFM) is proposed. This module leverages pixel positional correlations to guide the network in assigning appropriate weights to the features extracted from the two branches, effectively merging the feature information of both branches. Extensive experiments were conducted on the Cityscapes and CamVid datasets. Experimental results show that the proposed BMSeNet achieves a good balance between inference speed and segmentation accuracy, outperforming some state-of-the-art real-time semantic segmentation methods.
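The abstract does not give BAFM's exact formulation; the general idea of merging two branches with per-pixel weights can be sketched as follows, where deriving the weights from each branch's channel-mean energy is an illustrative assumption, not BMSeNet's definition:

```python
import numpy as np

def bilateral_attention_fuse(feat_a, feat_b):
    """Merge two (C, H, W) branch feature maps with per-pixel softmax weights.

    The weight for each pixel comes from a two-way softmax over the
    channel-mean "energy" of each branch (an illustrative choice).
    """
    energy_a = feat_a.mean(axis=0)  # (H, W)
    energy_b = feat_b.mean(axis=0)
    m = np.maximum(energy_a, energy_b)  # stabilize the softmax
    ea, eb = np.exp(energy_a - m), np.exp(energy_b - m)
    w_a = ea / (ea + eb)  # per-pixel weight of branch A, in (0, 1)
    return w_a * feat_a + (1.0 - w_a) * feat_b

# Branch A carries higher energy everywhere, so it dominates the fusion.
fused = bilateral_attention_fuse(np.ones((2, 3, 3)), np.zeros((2, 3, 3)))
```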

  • Article type: Journal Article
    Railway transportation has been integrated into people's lives. According to the "Notice on the Release of the General Technical Specification of High-speed Railway Power Supply Safety Testing (6C System) System" issued by the National Railway Administration of China in 2012, pantograph and slider monitoring devices must be installed in high-speed railway stations, station throats, and the inlet and exit lines of high-speed railway sections, and slider damage must be detected with high precision. The good condition of the pantograph slider is therefore very important for the normal operation of the railway system. As a component that supplies power to high-speed rail and subway trains, the pantograph must be kept intact. Pantograph wear arises mainly from the contact between the slider and the overhead wire during high-speed operation, which inevitably produces scratches, resulting in depressions on the upper surface of the slider. During long-term use, a depression that grows too deep creates a risk of fracture. It is therefore necessary to monitor sliders regularly and replace those with serious wear. At present, most traditional methods use automation technology or simple computer vision for detection, which is inefficient. This paper therefore introduces computer vision and deep learning into pantograph slider wear detection. Specifically, it studies deep-learning-based wear detection of the pantograph slider, with the main goals of improving detection accuracy and segmentation quality. Methodologically, the paper employs a linear array camera to enhance the quality of the datasets and integrates an attention mechanism to improve segmentation performance. Furthermore, this study introduces a novel image stitching method to address issues related to incomplete images, thereby providing a comprehensive solution.

  • Article type: Journal Article
    OBJECTIVE: To develop an automatic segmentation model for solid renal tumors on contrast-enhanced CTs and to visualize segmentation with associated confidence to promote clinical applicability.
    METHODS: The training dataset included solid renal tumor patients from two tertiary centers undergoing surgical resection and receiving CT in the corticomedullary or nephrogenic contrast media (CM) phase. Manual tumor segmentation was performed on all axial CT slices serving as reference standard for automatic segmentations. Independent testing was performed on the publicly available KiTS 2019 dataset. Ensembles of neural networks (ENN, DeepLabV3) were used for automatic renal tumor segmentation, and their performance was quantified with DICE score. ENN average foreground entropy measured segmentation confidence (binary: successful segmentation with DICE score > 0.8 versus inadequate segmentation ≤ 0.8).
    RESULTS: N = 639/n = 210 patients were included in the training and independent test datasets. Datasets were comparable regarding age and sex (p > 0.05), while renal tumors in the training dataset were larger and more frequently benign (p < 0.01). In the internal test dataset, the ENN model yielded a median DICE score = 0.84 (IQR: 0.62-0.97, corticomedullary) and 0.86 (IQR: 0.77-0.96, nephrogenic CM phase), and the segmentation confidence yielded an AUC = 0.89 (sensitivity = 0.86; specificity = 0.77). In the independent test dataset, the ENN model achieved a median DICE score = 0.84 (IQR: 0.71-0.97, corticomedullary CM phase), and the segmentation confidence achieved an accuracy = 0.84 (sensitivity = 0.86, specificity = 0.81). ENN segmentations were visualized with color-coded voxelwise tumor probabilities and thresholds superimposed on clinical CT images.
    CONCLUSIONS: ENN-based renal tumor segmentation performs robustly on external test data and might aid in renal tumor classification and treatment planning.
    CONCLUSIONS: Ensembles of neural networks (ENN) models could automatically segment renal tumors on routine CTs, enabling and standardizing downstream image analyses and treatment planning. Providing confidence measures and segmentation overlays on images can lower the threshold for clinical ENN implementation.
    CONCLUSIONS: Ensembles of neural networks (ENN) segmentation is visualized by color-coded voxelwise tumor probabilities and thresholds. ENN provided a high segmentation accuracy in internal testing and in an independent external test dataset. ENN models provide measures of segmentation confidence which can robustly discriminate between successful and inadequate segmentations.
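The confidence measure described above (ENN average foreground entropy) can be sketched generically; the 0.5 foreground threshold and the binary-entropy form are assumptions for illustration, not the paper's exact definition:

```python
import numpy as np

def foreground_entropy(prob_maps, fg_threshold=0.5, eps=1e-12):
    """Mean binary entropy over predicted-foreground voxels of an ensemble.

    prob_maps: (n_models, H, W) tumor probabilities from each ensemble member.
    Lower values mean the members agree, i.e. higher segmentation confidence.
    """
    p = np.clip(prob_maps.mean(axis=0), eps, 1.0 - eps)  # ensemble mean probability
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    fg = p > fg_threshold
    if not fg.any():  # no predicted foreground: fall back to the global mean
        return float(entropy.mean())
    return float(entropy[fg].mean())

confident = np.stack([np.full((2, 2), 0.99)] * 3)  # members agree strongly
uncertain = np.stack([np.full((2, 2), 0.55)] * 3)  # barely above threshold
```

Thresholding this scalar is what separates "successful" (DICE > 0.8) from "inadequate" segmentations in the study's binary confidence classification.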

  • Article type: Journal Article
    Echocardiography is a key tool for the diagnosis of cardiac diseases, and accurate left ventricular (LV) segmentation in echocardiographic videos is crucial for the assessment of cardiac function. However, since semantic segmentation of video needs to take into account the temporal correlation between frames, the task is very challenging. This article introduces an innovative method that incorporates a modified mixed attention mechanism into the SegFormer architecture, enabling it to effectively grasp the temporal correlation present in video data. The proposed method processes each time step by encoding the input image to obtain the current time feature map. This map, along with the historical time feature map, is then fed into a time-sensitive convolutional block attention module (TCBAM), a mixed attention mechanism. Its output serves as the historical time feature map for the subsequent sequence and as a combination of the current and historical time feature maps for the current sequence. The processed feature map is then input into a multilayer perceptron (MLP) and subsequent networks to generate the final segmented image. Extensive experiments were conducted on three datasets: Hamad Medical Corporation, Tampere University, and Qatar University (HMC-QU); Cardiac Acquisitions for Multi-structure Ultrasound Segmentation (CAMUS); and Sunnybrook Cardiac Data (SCD). The method achieves a Dice coefficient of 97.92% on the SCD dataset and an F1 score of 0.9263 on the CAMUS dataset, outperforming all other models. This research provides a promising solution to the temporal modeling challenge in video semantic segmentation tasks using transformer-based models and points out a promising direction for future research in this field.

  • Article type: Journal Article
    BACKGROUND: Fluorescence microscopy (FM) is an important and widely adopted biological imaging technique. Segmentation is often the first step in quantitative analysis of FM images. Deep neural networks (DNNs) have become the state-of-the-art tools for image segmentation. However, their performance on natural images may collapse under certain image corruptions or adversarial attacks. This poses real risks to their deployment in real-world applications. Although the robustness of DNN models in segmenting natural images has been studied extensively, their robustness in segmenting FM images remains poorly understood.
    RESULTS: To address this deficiency, we have developed an assay that benchmarks robustness of DNN segmentation models using datasets of realistic synthetic 2D FM images with precisely controlled corruptions or adversarial attacks. Using this assay, we have benchmarked robustness of ten representative models such as DeepLab and Vision Transformer. We find that models with good robustness on natural images may perform poorly on FM images. We also find new robustness properties of DNN models and new connections between their corruption robustness and adversarial robustness. To further assess the robustness of the selected models, we have also benchmarked them on real microscopy images of different modalities without using simulated degradation. The results are consistent with those obtained on the realistic synthetic images, confirming the fidelity and reliability of our image synthesis method as well as the effectiveness of our assay.
    CONCLUSIONS: Based on comprehensive benchmarking experiments, we have found distinct robustness properties of deep neural networks in semantic segmentation of FM images. Based on the findings, we have made specific recommendations on selection and design of robust models for FM image segmentation.

  • Article type: Journal Article
    Autonomous vehicles (AVs) have gained popularity in vehicular technology in recent years. For the development of secure and safe driving, AVs help reduce uncertainties such as crashes, heavy traffic, pedestrian behaviour, random objects, different types of roads, and their surrounding environments. In AVs, lane detection is one of the most important functions, supporting lane-keeping guidance and lane-departure warning. The literature shows that existing deep learning models perform well on well-maintained roads and in favourable weather conditions; however, performance in extreme weather and on curvy roads needs attention. The proposed work presents an accurate lane detection approach for poor roads, particularly those with curves, broken lanes, or no lane markings, and for extreme weather conditions. A Lane Detection with Convolutional Attention Mechanism (LD-CAM) model is proposed to achieve this outcome. The proposed method comprises an encoder, an enhanced convolution block attention module (E-CBAM), and a decoder. The encoder unit extracts the input image features, the E-CBAM focuses on the quality of the feature maps extracted from the encoder, and the decoder provides output without loss of any information in the original image. The work uses distinct data from three datasets: Tusimple for different weather conditions, Curve Lanes for curved lanes, and Cracks and Potholes for damaged roads. The model trained on these datasets attained an accuracy of 97.90%, precision of 98.92%, F1-score of 97.90%, IoU of 98.50%, and Dice coefficient of 98.80% on both structured and defective roads in extreme weather conditions.

  • Article type: Journal Article
    Typically, deep learning models for image segmentation tasks are trained using large datasets of images annotated at the pixel level, which can be expensive and highly time-consuming. A way to reduce the amount of annotated images required for training is to adopt a semi-supervised approach. In this regard, generative deep learning models, concretely Generative Adversarial Networks (GANs), have been adapted to semi-supervised training of segmentation tasks. This work proposes MaskGDM, a deep learning architecture combining some ideas from EditGAN, a GAN that jointly models images and their segmentations, together with a generative diffusion model. With careful integration, we find that using a generative diffusion model can improve EditGAN performance results in multiple segmentation datasets, both multi-class and with binary labels. According to the quantitative results obtained, the proposed model improves multi-class image segmentation when compared to the EditGAN and DatasetGAN models, respectively, by [Formula: see text] and [Formula: see text]. Moreover, using the ISIC dataset, our proposal improves the results from other models by up to [Formula: see text] for the binary image segmentation approach.

  • Article type: Journal Article
    Skin surface imaging has been used to examine skin lesions with a microscope for over a century and is commonly known as epiluminescence microscopy, dermatoscopy, or dermoscopy. Skin surface microscopy has been recommended to reduce the necessity of biopsy. This imaging technique could improve the clinical diagnostic performance of pigmented skin lesions. Different imaging techniques are employed in dermatology to find diseases. Segmentation and classification are the two main steps in the examination. The classification performance is influenced by the algorithm employed in the segmentation procedure. The most difficult aspect of segmentation is getting rid of the unwanted artifacts. Many deep-learning models are being created to segment skin lesions. In this paper, an analysis of common artifacts is proposed to investigate the segmentation performance of deep learning models with skin surface microscopic images. The most prevalent artifacts in skin images are hair and dark corners. These artifacts can be observed in the majority of dermoscopy images captured through various imaging techniques. While hair detection and removal methods are common, the introduction of dark corner detection and removal represents a novel approach to skin lesion segmentation. A comprehensive analysis of this segmentation performance is assessed using the surface density of artifacts. Assessment of the PH2, ISIC 2017, and ISIC 2018 datasets demonstrates significant enhancements, as reflected by Dice coefficients rising to 93.49 (86.81), 85.86 (79.91), and 75.38 (51.28) respectively, upon artifact removal. These results underscore the pivotal significance of artifact removal techniques in amplifying the efficacy of deep-learning models for skin lesion segmentation.
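The surface-density measure and dark-corner detection mentioned above can be sketched as follows; the circular field-of-view radius fraction and darkness threshold are illustrative values, not the paper's parameters:

```python
import numpy as np

def dark_corner_mask(gray, radius_frac=0.9, dark_thresh=40):
    """Flag dark pixels outside the central circular field of view.

    gray: (H, W) uint8 image. Dermoscope vignetting leaves dark corners
    outside the illuminated circle; both parameters are illustrative.
    """
    h, w = gray.shape
    yy, xx = np.ogrid[:h, :w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r2 = (min(h, w) / 2.0 * radius_frac) ** 2
    outside = (yy - cy) ** 2 + (xx - cx) ** 2 > r2
    return outside & (gray < dark_thresh)

def artifact_surface_density(mask):
    """Artifact pixels as a fraction of total image area."""
    return float(mask.mean())

img = np.full((100, 100), 200, dtype=np.uint8)  # bright, artifact-free image
img[:10, :10] = 0  # simulate one dark corner
```

Hair masks from a separate detector could be OR-ed into the same mask before computing the density used to stratify the segmentation analysis.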
