Image generation

  • Article type: Journal Article
    OBJECTIVE: The purpose of this study was to generate radiographs including dentigerous cysts by applying the latest generative adversarial network (GAN; StyleGAN3) to panoramic radiography.
    METHODS: A total of 459 cystic lesions were selected, and 409 images were randomly assigned as training data and 50 images as test data. StyleGAN3 training was performed for 500 000 images. Fifty generated images were objectively evaluated by comparing them with 50 real images according to four metrics: Fréchet inception distance (FID), kernel inception distance (KID), precision and recall, and inception score (IS). A subjective evaluation of the generated images was performed by three specialists who compared them with the real images in a visual Turing test.
    RESULTS: The results of the metrics were as follows: FID, 199.28; KID, 0.14; precision, 0.0047; recall, 0.00; and IS, 2.48. The overall result of the visual Turing test was 82.3%. No significant difference was found in the human scoring of root resorption.
    CONCLUSIONS: The images generated by StyleGAN3 were of such high quality that specialists could not distinguish them from the real images.
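    For context, the Fréchet inception distance reported above compares the Gaussian statistics of Inception-v3 features extracted from real and generated images. Below is a minimal sketch of that computation from two pre-extracted feature matrices; the feature extractor itself is omitted and the array shapes are placeholders (real use would feed 2048-dimensional Inception-v3 pool features).

```python
import numpy as np
from scipy import linalg

def frechet_distance(real_feats: np.ndarray, gen_feats: np.ndarray) -> float:
    """FID between two feature sets of shape (n_samples, n_dims)."""
    mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_g = np.cov(gen_feats, rowvar=False)
    # Matrix square root of the covariance product; discard tiny imaginary parts.
    covmean = linalg.sqrtm(cov_r @ cov_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r + cov_g - 2.0 * covmean))

# Placeholder features; a real evaluation would use Inception-v3 activations.
real = np.random.randn(50, 64)
fake = np.random.randn(50, 64)
print(frechet_distance(real, fake))
```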

  • Article type: Journal Article
    Fibrosis, a pathological increase in extracellular matrix proteins, is a significant health issue that hinders the function of many organs in the body, in some cases fatally. In the heart, fibrosis impacts on electrical propagation in a complex and poorly predictable fashion, potentially serving as a substrate for dangerous arrhythmias. Individual risk depends on the spatial manifestation of fibrotic tissue, and learning the spatial arrangement on the fine scale in order to predict these impacts still relies upon invasive ex vivo procedures. As a result, the effects of spatial variability on the symptomatic impact of cardiac fibrosis remain poorly understood. In this work, we address the issue of availability of such imaging data via a computational methodology for generating new realisations of cardiac fibrosis microstructure. Using the Perlin noise technique from computer graphics, together with an automated calibration process that requires only a single training image, we demonstrate successful capture of collagen texturing in four types of fibrosis microstructure observed in histological sections. We then use this generator to quantitatively analyse the conductive properties of these different types of cardiac fibrosis, as well as produce three-dimensional realisations of histologically-observed patterning. Owing to the generator's flexibility and automated calibration process, we also anticipate that it might be useful in producing additional realisations of other physiological structures.
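    As a rough illustration of the underlying technique only (not the authors' calibrated generator), classic 2D Perlin gradient noise can be evaluated on a pixel grid and thresholded to give a binary collagen/myocyte pattern. The grid size, lattice resolution and threshold below are arbitrary placeholders.

```python
import numpy as np

def perlin_noise_2d(shape, res):
    """Single-octave 2D Perlin gradient noise; res lattice cells per axis must divide shape."""
    def fade(t):
        return 6 * t**5 - 15 * t**4 + 10 * t**3

    delta = (res[0] / shape[0], res[1] / shape[1])
    d = (shape[0] // res[0], shape[1] // res[1])
    # Fractional coordinates of each pixel inside its lattice cell.
    grid = np.mgrid[0:res[0]:delta[0], 0:res[1]:delta[1]].transpose(1, 2, 0) % 1
    # Random unit gradient vectors at the lattice corners.
    angles = 2 * np.pi * np.random.rand(res[0] + 1, res[1] + 1)
    grads = np.dstack((np.cos(angles), np.sin(angles)))
    g00 = grads[:-1, :-1].repeat(d[0], 0).repeat(d[1], 1)
    g10 = grads[1:, :-1].repeat(d[0], 0).repeat(d[1], 1)
    g01 = grads[:-1, 1:].repeat(d[0], 0).repeat(d[1], 1)
    g11 = grads[1:, 1:].repeat(d[0], 0).repeat(d[1], 1)
    # Dot products between corner gradients and offset vectors (ramps).
    n00 = np.sum(np.dstack((grid[:, :, 0], grid[:, :, 1])) * g00, 2)
    n10 = np.sum(np.dstack((grid[:, :, 0] - 1, grid[:, :, 1])) * g10, 2)
    n01 = np.sum(np.dstack((grid[:, :, 0], grid[:, :, 1] - 1)) * g01, 2)
    n11 = np.sum(np.dstack((grid[:, :, 0] - 1, grid[:, :, 1] - 1)) * g11, 2)
    # Smooth interpolation between the four corner contributions.
    t = fade(grid)
    n0 = n00 * (1 - t[:, :, 0]) + t[:, :, 0] * n10
    n1 = n01 * (1 - t[:, :, 0]) + t[:, :, 0] * n11
    return np.sqrt(2) * ((1 - t[:, :, 1]) * n0 + t[:, :, 1] * n1)

noise = perlin_noise_2d((256, 256), (8, 8))
fibrosis_mask = noise > 0.3   # arbitrary threshold: True = collagen, False = myocyte
```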

  • Article type: Journal Article
    Machine learning has been employed to recognize protein localization at the subcellular level, which greatly facilitates protein function studies, especially for multi-label proteins that localize in more than one organelle. However, existing works mostly study the qualitative classification of protein subcellular locations, ignoring the fraction of a multi-label protein residing in each location. In fact, about 50% of proteins are multi-label proteins, and ignoring this quantitative information greatly restricts the understanding of their spatial distribution and functional mechanisms. One reason for the lack of quantitative studies is the insufficiency of quantitative annotations. To address this data shortage, we propose a generative model, PLocGAN, which can generate cell images with conditional quantitative annotation of the fluorescence distribution. The model is a conditional generative adversarial network in which the condition learning uses partial label learning to overcome the lack of training labels and allows training with only qualitative labels. Meanwhile, it uses contrastive learning to enhance the diversity of the generated images. We assessed PLocGAN on four pixel-fused synthetic datasets and one real dataset, and demonstrated that the model can generate images with good fidelity and diversity, outperforming existing state-of-the-art generative methods. To verify the utility of PLocGAN in the quantitative prediction of protein subcellular locations, we replaced the training images with generated quantitative images and built prediction models, and found that they had a boosting effect on the quantitative estimation. This work demonstrates the effectiveness of deep generative models in bioimage analysis and provides a new solution for quantitative subcellular proteomics.
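    As a schematic of the kind of quantitative conditioning this abstract describes (not the PLocGAN architecture itself), a DCGAN-style generator can take a latent vector concatenated with a per-organelle fraction vector, so the synthesized image depends on the quantitative label. The layer sizes and the two-organelle condition below are invented placeholders.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Toy generator: latent z plus an organelle-fraction vector -> 64x64 image."""
    def __init__(self, z_dim: int = 100, cond_dim: int = 2, ch: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim + cond_dim, ch * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ch * 8), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 8, ch * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ch * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ch * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ch), nn.ReLU(True),
            nn.ConvTranspose2d(ch, 1, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, z: torch.Tensor, fractions: torch.Tensor) -> torch.Tensor:
        # fractions: (batch, cond_dim), e.g. [0.7, 0.3] = 70% in one organelle, 30% in another.
        x = torch.cat([z, fractions], dim=1).unsqueeze(-1).unsqueeze(-1)
        return self.net(x)

z = torch.randn(4, 100)
frac = torch.tensor([[0.7, 0.3]] * 4)       # hypothetical quantitative labels
imgs = ConditionalGenerator()(z, frac)      # -> (4, 1, 64, 64)
```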

  • Article type: Journal Article
    Objective. Head and neck radiotherapy planning requires electron densities from different tissues for dose calculation. Dose calculation from imaging modalities such as MRI remains an unsolved problem, since this modality does not provide information about electron density. Approach. We propose a generative adversarial network (GAN) approach that synthesizes CT (sCT) images from T1-weighted MRI acquisitions in head and neck cancer patients. Our contribution is to exploit new features that are relevant for improving multimodal image synthesis, and thus the quality of the generated CT images. More precisely, we propose a dual-branch generator based on the U-Net architecture and on an augmented multi-planar branch. The augmented branch learns specific 3D dynamic features, which describe dynamic image shape variations and are extracted from different viewpoints of the volumetric input MRI. The architecture of the proposed model relies on an end-to-end convolutional U-Net embedding network. Results. The proposed model achieves a mean absolute error (MAE) of 18.76 ± 5.167 in the target Hounsfield unit (HU) space on sagittal head and neck patients, with a mean structural similarity (MSSIM) of 0.95 ± 0.09 and a Fréchet inception distance (FID) of 145.60 ± 8.38. The model yields an MAE of 26.83 ± 8.27 when generating specific primary tumor regions on axial patient acquisitions, with a Dice score of 0.73 ± 0.06 and an FID of 122.58 ± 7.55. The improvement of our model over other state-of-the-art GAN approaches is 3.8% on a tumor test set. On both sagittal and axial acquisitions, the model yields the best peak signal-to-noise ratios (PSNR), 27.89 ± 2.22 and 26.08 ± 2.95, to synthesize MRI from CT input. Significance. The proposed model synthesizes both sagittal and axial CT tumor images used for radiotherapy treatment planning in head and neck cancer cases. The performance analysis across different imaging metrics and under different evaluation strategies demonstrates the effectiveness of our dual CT synthesis model in producing high-quality sCT images compared with other state-of-the-art approaches. Our model could improve clinical tumor analysis; further clinical validation remains to be explored.
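    The voxel-wise metrics quoted above (MAE in HU, PSNR, Dice) are standard; a minimal sketch of how they could be computed for one synthetic/real CT pair follows, with the HU data range used for PSNR an assumed placeholder rather than the value used in the study.

```python
import numpy as np

def mae(ct_real: np.ndarray, ct_synth: np.ndarray) -> float:
    """Mean absolute error in Hounsfield units."""
    return float(np.mean(np.abs(ct_real - ct_synth)))

def psnr(ct_real: np.ndarray, ct_synth: np.ndarray, data_range: float = 4095.0) -> float:
    """Peak signal-to-noise ratio in dB; data_range is an assumed HU span."""
    mse = np.mean((ct_real - ct_synth) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))

def dice(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice overlap between two binary tumor masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    return float(2.0 * inter / (mask_a.sum() + mask_b.sum()))
```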

  • Article type: Journal Article
    This paper presents a novel methodology for content-based image retrieval (CBIR) that shifts the focus from conventional domain-specific image queries to more complex text-based query processing. Latent diffusion models are employed to interpret complex textual prompts and to address the requirement of interpreting such queries effectively. The latent diffusion model transforms complex textual queries into visually engaging representations, establishing a seamless connection between textual descriptions and visual content. A custom triplet network design is at the heart of our retrieval method. Once trained, the triplet network embeds both the generated query image and the images in the database. The cosine similarity metric is used to assess the similarity between these feature representations in order to find and retrieve the relevant images. Our experimental results show that latent diffusion models can successfully bridge the gap between complex textual prompts and image retrieval without relying on labels or metadata attached to database images. This advancement sets the stage for future explorations in image retrieval, leveraging generative AI capabilities to cater to the ever-evolving demands of big data and complex query interpretation.
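    A minimal sketch of the retrieval idea described above, under the assumption that an embedding network already exists: embeddings are trained with a triplet margin loss, and database images are ranked by cosine similarity to the embedding of the diffusion-generated query image. The tensors and dimensions below are placeholders, not the paper's configuration.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin: float = 0.2):
    """Triplet margin loss on L2-normalized embeddings, using cosine distance."""
    anchor, positive, negative = (F.normalize(x, dim=1) for x in (anchor, positive, negative))
    d_pos = 1.0 - (anchor * positive).sum(dim=1)   # cosine distance to positive
    d_neg = 1.0 - (anchor * negative).sum(dim=1)   # cosine distance to negative
    return F.relu(d_pos - d_neg + margin).mean()

def retrieve(query_emb: torch.Tensor, db_embs: torch.Tensor, k: int = 5):
    """Indices of the k database images most cosine-similar to the query embedding."""
    sims = F.cosine_similarity(query_emb.unsqueeze(0), db_embs, dim=1)
    return sims.topk(k).indices

# Placeholder usage: query_emb would come from embedding the image the latent
# diffusion model generated for the text prompt.
db = torch.randn(1000, 256)
q = torch.randn(256)
print(retrieve(q, db))
```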

  • Article type: Journal Article
    In light of growing concerns about the misuse of personal data resulting from the widespread use of artificial intelligence technology, it is necessary to implement robust privacy-protection methods. However, existing methods for protecting facial privacy suffer from issues such as poor visual quality, distortion and limited reusability. To tackle this challenge, we propose a novel approach called Diffusion Models for Face Privacy Protection (DIFP). Our method utilizes a face generator that is conditionally controlled and reality-guided to produce high-resolution encrypted faces that are photorealistic while preserving the naturalness and recoverability of the original facial information. We employ a two-stage training strategy to generate protected faces with guidance on identity and style, followed by an iterative technique for improving latent variables to enhance realism. Additionally, we introduce diffusion model denoising for identity recovery, which facilitates the removal of encryption and restoration of the original face when required. Experimental results demonstrate the effectiveness of our method in qualitative privacy protection, achieving high success rates in evading face-recognition tools and enabling near-perfect restoration of occluded faces.
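    For orientation only: the "diffusion model denoising" used here for identity recovery builds on the standard DDPM reverse update. The sketch below shows that generic update with an untrained placeholder noise predictor; it is not the DIFP recovery procedure, and the schedule and shapes are assumptions.

```python
import torch

def ddpm_reverse_step(x_t, t, eps_model, betas):
    """One standard DDPM denoising step: x_t -> x_{t-1} (sigma_t^2 = beta_t variant)."""
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)      # recomputed each step for brevity
    eps = eps_model(x_t, t)                        # predicted noise
    coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
    mean = (x_t - coef * eps) / torch.sqrt(alphas[t])
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    return mean + torch.sqrt(betas[t]) * noise

# Placeholder noise predictor; a real model would be a trained, identity-conditioned U-Net.
eps_model = lambda x, t: torch.zeros_like(x)
betas = torch.linspace(1e-4, 0.02, 1000)
x = torch.randn(1, 3, 64, 64)
for t in reversed(range(1000)):
    x = ddpm_reverse_step(x, t, eps_model, betas)
```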

  • Article type: Journal Article
    BACKGROUND: Image-based crop growth modeling can substantially contribute to precision agriculture by revealing spatial crop development over time, which allows an early and location-specific estimation of relevant future plant traits, such as leaf area or biomass. A prerequisite for realistic and sharp crop image generation is the integration of multiple growth-influencing conditions in a model, such as an image of an initial growth stage, the associated growth time, and further information about the field treatment. While image-based models provide more flexibility for crop growth modeling than process-based models, there is still a significant research gap in the comprehensive integration of various growth-influencing conditions. Further exploration and investigation are needed to address this gap.
    METHODS: We present a two-stage framework consisting first of an image generation model and second of a growth estimation model, independently trained. The image generation model is a conditional Wasserstein generative adversarial network (CWGAN). In the generator of this model, conditional batch normalization (CBN) is used to integrate conditions of different types along with the input image. This allows the model to generate time-varying artificial images dependent on multiple influencing factors. These images are used by the second part of the framework for plant phenotyping by deriving plant-specific traits and comparing them with those of non-artificial (real) reference images. In addition, image quality is evaluated using multi-scale structural similarity (MS-SSIM), learned perceptual image patch similarity (LPIPS), and Fréchet inception distance (FID). During inference, the framework allows image generation for any combination of conditions used in training; we call this generation data-driven crop growth simulation.
    RESULTS: Experiments are performed on three datasets of different complexity. These datasets include the laboratory plant Arabidopsis thaliana (Arabidopsis) and crops grown under real field conditions, namely cauliflower (GrowliFlower) and crop mixtures consisting of faba bean and spring wheat (MixedCrop). In all cases, the framework allows realistic, sharp image generations with a slight loss of quality from short-term to long-term predictions. For MixedCrop grown under varying treatments (different cultivars, sowing densities), the results show that adding this treatment information increases the generation quality and the phenotyping accuracy measured by the estimated biomass. Simulations of varying growth-influencing conditions performed with the trained framework provide valuable insights into how such factors relate to crop appearance, which is particularly useful in complex, less explored crop mixture systems. Further results show that adding process-based simulated biomass as a condition increases the accuracy of the phenotypic traits derived from the predicted images. This demonstrates the potential of our framework to serve as an interface between data-driven and process-based crop growth models.
    CONCLUSIONS: The realistic generation and simulation of future plant appearances is feasible with a multi-conditional CWGAN. The presented framework complements process-based models and overcomes their limitations, such as their reliance on assumptions and their limited field-localization specificity, through realistic visualizations of spatial crop development that make the model predictions highly interpretable.
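    The generator described in METHODS integrates its conditions via conditional batch normalization (CBN). A minimal sketch of such a layer, in which the normalization's per-channel scale and shift are predicted from a condition vector (e.g. growth time plus a treatment code), is given below; all dimensions and the condition encoding are placeholders rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class ConditionalBatchNorm2d(nn.Module):
    """BatchNorm whose affine scale/shift are predicted from a condition vector."""
    def __init__(self, num_features: int, cond_dim: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(num_features, affine=False)
        self.gamma = nn.Linear(cond_dim, num_features)   # per-channel scale
        self.beta = nn.Linear(cond_dim, num_features)    # per-channel shift
        # Initialize so the layer starts as a plain BatchNorm (gamma=1, beta=0).
        nn.init.zeros_(self.gamma.weight); nn.init.ones_(self.gamma.bias)
        nn.init.zeros_(self.beta.weight); nn.init.zeros_(self.beta.bias)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        out = self.bn(x)
        g = self.gamma(cond).unsqueeze(-1).unsqueeze(-1)
        b = self.beta(cond).unsqueeze(-1).unsqueeze(-1)
        return g * out + b

# Placeholder condition: e.g. normalized growth time plus a one-hot treatment code.
feat = torch.randn(2, 64, 32, 32)
cond = torch.randn(2, 8)
print(ConditionalBatchNorm2d(64, 8)(feat, cond).shape)
```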

  • Article type: Journal Article
    The cast thin sections of tight oil reservoirs contain important parameters such as rock mineral composition and content, porosity, permeability and stratigraphic characteristics, which are of great significance for reservoir evaluation. The use of deep learning for intelligent identification of thin section images is a development trend in mineral identification. However, the difficulty of making cast thin sections, the complexity of the preparation process and the high cost of thin section annotation have led to a shortage of cast thin section images, which cannot meet the training requirements of deep learning image recognition models. To increase the sample size and improve the training of deep learning models, we propose a deep-learning-based method for generating and annotating thin section images of tight oil reservoirs, taking the Fuyu reservoir in the Sanzhao Sag as the target area. Firstly, the Augmentor strategy space is used to preliminarily augment the original images while preserving their features to meet the requirements of the model. Secondly, a category attention mechanism is added to the original StyleGAN network to avoid the influence of the uneven number of components in thin sections on the quality of the generated images. Then, the SALM annotation module is designed to achieve semi-automatic annotation of the generated images. Finally, experiments on image sharpness, distortion, accuracy and annotation efficiency were designed to verify the advantages of the method in terms of image quality and annotation efficiency.
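    The "category attention mechanism" added to StyleGAN is not specified in detail in this abstract. Purely as a hedged stand-in, a squeeze-and-excitation-style channel attention block, a common way to re-weight feature channels so that under-represented component categories are not drowned out, could look like the following; it is an illustration, not the authors' module.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style gating: re-weights channels by learned importance."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.fc(x.mean(dim=(2, 3)))           # global average pool -> (B, C)
        return x * w.unsqueeze(-1).unsqueeze(-1)  # channel-wise re-weighting

feats = torch.randn(2, 256, 16, 16)
print(ChannelAttention(256)(feats).shape)   # torch.Size([2, 256, 16, 16])
```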

  • Article type: Journal Article
    The generation of images from electroencephalography (EEG) signals has become a popular research topic in recent years because it can bridge the gap between brain signals and visual stimuli and has wide application prospects in neuroscience and computer vision. However, due to the high complexity of EEG signals, the reconstruction of visual stimuli from EEG signals remains a challenge. In this work, we propose an EEG-ConDiffusion framework that involves three stages: feature extraction, fine-tuning of a pretrained model, and image generation. In the EEG-ConDiffusion framework, classification features of the EEG signals are first obtained through the feature extraction block. The classification features are then used as conditions to fine-tune a Stable Diffusion model in the image generation block, generating images with the corresponding semantics. The framework combines EEG classification with image generation to enhance the quality of the generated images. It was tested on an EEG-based visual classification dataset, and its performance is measured by classification accuracy, 50-way top-k accuracy, and inception score. The results indicate that the proposed EEG-ConDiffusion framework can extract effective classification features and generate high-quality images from EEG signals, realizing EEG-to-image conversion.
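    Of the metrics listed, "50-way top-k accuracy" is the least standard; it is commonly computed by checking whether the ground-truth class appears among the top k predictions when scoring against the candidate classes. A sketch under that assumption, with placeholder tensors:

```python
import torch

def n_way_top_k_accuracy(logits: torch.Tensor, targets: torch.Tensor, k: int = 5) -> float:
    """Fraction of samples whose true class is among the top-k predictions.
    logits: (n_samples, n_classes), e.g. n_classes = 50; targets: (n_samples,)."""
    topk = logits.topk(k, dim=1).indices                  # (n_samples, k)
    hits = (topk == targets.unsqueeze(1)).any(dim=1)
    return hits.float().mean().item()

logits = torch.randn(200, 50)            # placeholder classifier scores for 50 classes
targets = torch.randint(0, 50, (200,))
print(n_way_top_k_accuracy(logits, targets, k=5))
```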

  • Article type: Journal Article
    OBJECTIVE: Brain-Computer Interface (BCI) technology has recently been advancing rapidly, bringing significant hope for improving human health and quality of life. Decoding and visualizing visually evoked electroencephalography (EEG) signals into corresponding images plays a crucial role in the practical application of BCI technology. The recent emergence of diffusion models provides a good modeling basis for this work. However, the existing diffusion models still have great challenges in generating high-quality images from EEG, due to the low signal-to-noise ratio and strong randomness of EEG signals. The purpose of this study is to address the above-mentioned challenges by proposing a framework named NeuroDM that can decode human brain responses to visual stimuli from EEG-recorded brain activity.
    METHODS: In NeuroDM, an EEG-Visual-Transformer (EV-Transformer) is used to extract the visual-related features with high classification accuracy from EEG signals, then an EEG-Guided Diffusion Model (EG-DM) is employed to synthesize high-quality images from the EEG visual-related features.
    RESULTS: We conducted experiments on two EEG datasets (one is a forty-class dataset, and the other is a four-class dataset). In the task of EEG decoding, we achieved average accuracies of 99.80% and 92.07% on two datasets, respectively. In the task of EEG visualization, the Inception Score of the images generated by NeuroDM reached 15.04 and 8.67, respectively. All the above results outperform existing methods.
    CONCLUSIONS: The experimental results on two EEG datasets demonstrate the effectiveness of the NeuroDM framework, achieving state-of-the-art performance in terms of classification accuracy and image quality. Furthermore, our NeuroDM exhibits strong generalization capabilities and the ability to generate diverse images.
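    For reference, the Inception Score reported in RESULTS is computed from the class-probability outputs of an Inception classifier on the generated images. A minimal sketch from a pre-computed probability matrix follows; the classifier step is omitted and the placeholder probabilities are random.

```python
import numpy as np

def inception_score(probs: np.ndarray, eps: float = 1e-12) -> float:
    """IS = exp(mean KL(p(y|x) || p(y))) for class probabilities of shape (n_images, n_classes)."""
    marginal = probs.mean(axis=0, keepdims=True)               # p(y)
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    return float(np.exp(kl.mean()))

# Placeholder: softmax outputs of an Inception-v3 classifier on generated images.
probs = np.random.dirichlet(np.ones(1000), size=500)
print(inception_score(probs))
```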