Spatial attention

  • Article type: Journal Article
    Fourier ptychographic microscopy (FPM) is a microscopy imaging technique based on optical principles. It employs Fourier optics to separate and combine different optical information from a sample. However, noise introduced during the imaging process often results in poor resolution of the reconstructed image. This article presents an approach based on a residual local mixture network to improve the quality of Fourier ptychographic reconstruction images. Incorporating channel attention and spatial attention into the FPM reconstruction process improves the efficiency of network reconstruction and reduces the reconstruction time. Additionally, the introduction of a Gaussian diffusion model further reduces coherent artifacts and improves image reconstruction quality. Comparative experimental results indicate that this network achieves better reconstruction quality, outperforming existing methods in both subjective observation and objective quantitative evaluation.

  • Article type: Journal Article
    This work aims to improve limited-angle (LA) cone beam computed tomography (CBCT) by developing deep learning (DL) methods for real clinical CBCT projection data; to the best of our knowledge, it is the first feasibility study of LA-CBCT based on clinical projection data. In radiation therapy (RT), CBCT is routinely used as the on-board imaging modality for patient setup. Compared to diagnostic CT, CBCT has a long acquisition time, e.g., 60 seconds for a full 360° rotation, which makes it subject to motion artifacts. Therefore, LA-CBCT, if achievable, is of great interest for RT because it proportionally reduces the scanning time in addition to the radiation dose. However, LA-CBCT suffers from severe wedge artifacts and image distortions. Targeting real clinical projection data, we explored various DL methods, such as image-domain, data-domain, and hybrid-domain methods, and finally developed a so-called Structure-Enhanced Attention Network (SEA-Net) method that has the best image quality from clinical projection data among the DL methods we implemented. Specifically, the proposed SEA-Net employs a specialized structure enhancement sub-network to promote texture preservation. Based on the observation that the distribution of wedge artifacts in reconstruction images is non-uniform, a spatial attention module is used to emphasize the relevant regions while ignoring the irrelevant ones, which leads to more accurate texture restoration.

  • Article type: Journal Article
    Wild desert grasslands are characterized by diverse habitats, uneven plant distribution, similarities among plant classes, and the presence of plant shadows. However, the existing models for detecting plant species in desert grasslands exhibit low precision, require a large number of parameters, and incur high computational cost, rendering them unsuitable for deployment in plant recognition scenarios within these environments. To address these challenges, this paper proposes a lightweight and fast plant species detection system, termed YOLOv8s-KDT, tailored for complex desert grassland environments. First, the model introduces a dynamic convolutional KernelWarehouse method to reduce the dimensionality of convolutional kernels and increase their number, thus achieving a better balance between parameter efficiency and representation ability. Second, the model incorporates triplet attention into its feature extraction network, effectively capturing the relationship between channel and spatial position and enhancing the model's feature extraction capabilities. Finally, the introduction of a dynamic detection head tackles the issues related to the target detection head and attention non-uniformity, thus improving the representation of the target detection head while reducing computational cost. The experimental results demonstrate that the upgraded YOLOv8s-KDT model can rapidly and effectively identify desert grassland plants. Compared to the original model, FLOPs decreased by 50.8%, accuracy improved by 4.5%, and mAP increased by 5.6%. Currently, the YOLOv8s-KDT model is deployed in the mobile plant identification app for Ningxia desert grassland and in the fixed-point ecological information observation platform. It facilitates the investigation of desert grassland vegetation distribution across the entire Ningxia region as well as long-term observation and tracking of plant ecological information in specific areas, such as Dashuikeng, Huangji Field, and Hongsibu in Ningxia.

  • Article type: Journal Article
    Attention is often viewed as a mental spotlight, which can be scaled like a zoom lens at specific spatial locations and features a center-surround gradient. Here, we demonstrate a neural signature of the attentional spotlight in signal transmission along the visual hierarchy. fMRI background connectivity analysis was performed between retinotopic V1 and downstream areas to characterize the spatial distribution of inter-areal interaction under two attentional states. We found that, compared to diffused attention, focal attention sharpened the spatial gradient in the strength of the background connectivity. Dynamic causal modeling analysis further revealed the effect of attention on both the feedback and feedforward connectivity between V1 and extrastriate cortex. In a context that induced a strong crowding effect, the effect of attention on the background connectivity profile diminished. Our findings reveal a context-dependent attentional prioritization in information transmission, achieved by modulating the recurrent processing across the early stages of the human visual cortex.

  • Article type: Journal Article
    Addressing the critical need for accurate fall event detection, given the potentially severe impact of falls, this paper introduces the Spatial Channel and Pooling Enhanced You Only Look Once version 5 small (SCPE-YOLOv5s) model. Fall events are challenging to detect because of their varying scales and subtle pose features. To address this problem, SCPE-YOLOv5s introduces spatial attention into the Efficient Channel Attention (ECA) network, which significantly enhances the model's ability to extract features from the spatial pose distribution. Moreover, the model integrates average pooling layers into the Spatial Pyramid Pooling (SPP) network to support the multi-scale extraction of fall poses. Meanwhile, by incorporating the ECA network into SPP, the model effectively combines global and local features to further enhance feature extraction. This paper validates SCPE-YOLOv5s on a public dataset, demonstrating that it achieves a mean Average Precision of 88.29%, outperforming You Only Look Once version 5 small by 4.87%. Additionally, the model achieves 57.4 frames per second. Therefore, SCPE-YOLOv5s provides a novel solution for fall event detection.

  • Article type: Journal Article
    Vaginitis is a common disease among women and has a high recurrence rate. The primary diagnostic method is fluorescence microscopic inspection, but manual inspection is inefficient and can lead to false or missed detections. Automatic cell identification and localization in microscopic images are therefore necessary. For vaginitis diagnosis, clue cells and trichomonas are two important indicators, and they are difficult to detect because of their different scales and image characteristics. This study proposes a Multi-Scale Perceptual YOLO (MSP-YOLO) with a super-resolution reconstruction branch to meet the detection requirements of clue cells and trichomonas. Based on the scales and image characteristics of clue cells and trichomonas, we added a super-resolution reconstruction branch to the detection network. This branch guides the detection branch to focus on subtle feature differences. In addition, we proposed an attention-based feature fusion module that incorporates a dilated convolutional group. This module makes the network pay attention to the non-centered features of the large target clue cells, which helps enhance detection sensitivity. Experimental results show that the proposed detection network MSP-YOLO can improve sensitivity without compromising specificity. For clue cell and trichomonas detection, the proposed network achieved sensitivities of 0.706 and 0.910, respectively, which were 0.218 and 0.051 higher than those of the baseline model. In this study, the characteristics of the super-resolution reconstruction task are used to guide the network to effectively extract and process image features. The proposed network has increased sensitivity, which makes it possible to detect vaginitis automatically.
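
For readers less familiar with the metrics quoted above, the following is a minimal sketch of how sensitivity and specificity are conventionally computed from detection counts, together with the baseline sensitivities implied by the reported gains. The function names and example counts are illustrative assumptions and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual positives that were detected (recall)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of actual negatives that were correctly rejected."""
    return true_neg / (true_neg + false_pos)


# Baseline sensitivities implied by the abstract: reported value minus reported gain.
baseline_clue_cell = round(0.706 - 0.218, 3)
baseline_trichomonas = round(0.910 - 0.051, 3)

# Hypothetical counts, for illustration only.
example_sensitivity = sensitivity(true_pos=8, false_neg=2)
example_specificity = specificity(true_neg=9, false_pos=1)
```

Under these figures the baseline model would have had sensitivities of 0.488 for clue cells and 0.859 for trichomonas.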

  • Article type: Journal Article
    As one of the three major outdoor components of the railroad signal system, the track circuit plays an important role in ensuring the safety and efficiency of train operation. Therefore, when a fault occurs, its cause needs to be found quickly and accurately and dealt with in a timely manner, to avoid reducing the efficiency of train operation and causing safety accidents. This article proposes a fault diagnosis method based on a multi-scale attention network, which uses the Gramian Angular Field (GAF) to transform one-dimensional time series into two-dimensional images, making full use of the advantages of convolutional networks in processing image data. A new feature fusion training structure is designed to effectively train the model, fully extract features at different scales, and fuse spatial feature information through spatial attention mechanisms. Finally, experiments are conducted using real track circuit fault datasets; the accuracy of fault diagnosis reaches 99.36%, and our model demonstrates better performance than classical and state-of-the-art models. Ablation experiments verified that each module in the designed model plays a key role.
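
The Gramian Angular Field step mentioned above can be sketched as follows. This is a minimal illustration that assumes the summation variant (GASF) with min-max rescaling to [-1, 1]; the abstract does not say which variant or scaling the authors used, and the function names are hypothetical.

```python
import math


def rescale_to_unit_interval(series):
    """Min-max rescale a 1-D series to [-1, 1] so that arccos is defined."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    return [(2 * x - hi - lo) / (hi - lo) for x in series]


def gramian_angular_summation_field(series):
    """Return the N x N matrix G[i][j] = cos(phi_i + phi_j), phi = arccos(x)."""
    scaled = rescale_to_unit_interval(series)
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    angles = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]


# A series of length N becomes an N x N image-like matrix.
gaf_image = gramian_angular_summation_field([0.0, 1.0, 2.0])
```

The resulting matrix is symmetric and can be fed to a convolutional network like a single-channel image.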

  • Article type: Journal Article
    Rapid, effective, and non-destructive detection of defective maize kernels is crucial for their high-quality storage in granaries. Hyperspectral imaging (HSI) coupled with a convolutional neural network (CNN) based on a spectral and spatial attention (Spl-Spal-At) module was proposed for identifying different types of maize kernels. HSI data within 380-1000 nm were collected for six classes of kernels: sprouted, heat-damaged, insect-damaged, moldy, broken, and healthy. The CNN-Spl-At, CNN-Spal-At, and CNN-Spl-Spal-At models were established with the spectra, the images, and their fused features, respectively, as inputs for recognizing the different kernels. The performance of the proposed models was further compared with that of conventional models built with support vector machine (SVM) and extreme learning machine (ELM). The results indicated that the recognition ability of the CNN models with attention was significantly better than that of the SVM and ELM models, and that fused features were more conducive to expressing the appearance of different kernels than single features. The CNN-Spl-Spal-At model achieved the best recognition result, with high average classification accuracies of 98.04% and 94.56% for the training and testing sets, respectively. The recognition results were visually presented on the surface images of the kernels using different colors. The CNN-Spl-Spal-At model built in this study could effectively detect defective maize kernels, and it also has great potential to provide analysis approaches for developing HSI-based non-destructive testing equipment for maize quality.

  • Article type: Journal Article
    The convolution operation is performed within a local window of the input image. Therefore, a convolutional neural network (CNN) is good at obtaining local information. Meanwhile, the self-attention (SA) mechanism extracts features by calculating the correlation between tokens from all positions in the image, which gives it an advantage in obtaining global information. The two modules can therefore complement each other to improve feature extraction ability, and how to fuse them effectively is a problem worthy of further study. In this paper, we propose CSAP-UNet, a parallel CNN and SA network with U-Net as its backbone. The encoder consists of two parallel branches, a CNN and a Transformer, that extract features from the input image, taking into account both global dependencies and local information. Because medical images come from certain frequency bands within the spectrum, their color channels are not as uniform as those of natural images. Meanwhile, medical segmentation pays more attention to lesion regions in the image. The attention fusion module (AFM) integrates channel attention and spatial attention in series to fuse the output features of the two branches. The medical image segmentation task is essentially to locate the boundary of the object in the image. The boundary enhancement module (BEM) is placed in the shallow layer of the proposed network to focus more specifically on pixel-level edge details. Experimental results on three public datasets validate that CSAP-UNet outperforms state-of-the-art networks, particularly on the ISIC 2017 dataset. The cross-dataset evaluation on Kvasir and CVC-ClinicDB shows that CSAP-UNet has strong generalization ability. Ablation experiments also indicate the effectiveness of the designed modules. The code for training and testing is available at https://github.com/zhouzhou1201/CSAP-UNet.git.
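
The idea of applying channel attention and then spatial attention in series can be illustrated with the toy sketch below. It is not the AFM from the paper: the learned layers that such modules normally contain (an MLP in the channel branch, a convolution in the spatial branch) are replaced here by fixed, parameter-free sigmoid gates, and all function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a gate derived from its global average.

    feat is a list of C channels, each an H x W list of lists.
    """
    gated = []
    for channel in feat:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        weight = sigmoid(mean)  # stand-in for a learned MLP
        gated.append([[weight * v for v in row] for row in channel])
    return gated


def spatial_attention(feat):
    """Scale each pixel by a gate derived from cross-channel mean and max."""
    num_channels, height, width = len(feat), len(feat[0]), len(feat[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(num_channels)]
    for i in range(height):
        for j in range(width):
            values = [feat[k][i][j] for k in range(num_channels)]
            # Stand-in for a learned convolution over the pooled maps.
            gate = sigmoid(sum(values) / num_channels + max(values))
            for k in range(num_channels):
                out[k][i][j] = gate * feat[k][i][j]
    return out


def attention_in_series(feat):
    """Channel attention first, then spatial attention on its output."""
    return spatial_attention(channel_attention(feat))


features = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
refined = attention_in_series(features)
```

Because both gates lie in (0, 1), the output keeps the input's shape and never exceeds it in magnitude; a trained module would learn where to place that emphasis.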

  • Article type: Journal Article
    Hyperspectral image (HSI) classification is a highly challenging task, particularly in fields like crop yield prediction and agricultural infrastructure detection. These applications often involve complex image types, such as soil, vegetation, water bodies, and urban structures, encompassing a variety of surface features. In HSI, the strong correlation between adjacent bands leads to redundancy in spectral information, while using image patches as the basic unit of classification causes redundancy in spatial information. To extract key information for classification more effectively from this massive redundancy, we propose the CESA-MCFormer model, which builds upon the transformer architecture by introducing the Center Enhanced Spatial Attention (CESA) module and Morphological Convolution (MC). The CESA module combines hard coding and soft coding to provide the model with prior spatial information before the mixing of spatial features, introducing comprehensive spatial information. MC employs a series of learnable pooling operations, not only extracting key details in both spatial and spectral dimensions but also effectively merging this information. By integrating the CESA module and MC, the CESA-MCFormer model employs a "Selection-Extraction" feature processing strategy, enabling it to achieve precise classification with minimal samples, without relying on dimension reduction techniques such as PCA. To thoroughly evaluate our method, we conducted extensive experiments on the IP, UP, and Chikusei datasets, comparing our method with the latest advanced approaches. The experimental results demonstrate that CESA-MCFormer achieved outstanding performance on all three test datasets, with Kappa coefficients of 96.38%, 98.24%, and 99.53%, respectively.
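
The Kappa coefficient quoted above is conventionally Cohen's kappa, which corrects overall accuracy for chance agreement. A minimal sketch of the standard computation from a confusion matrix follows; the matrix values are made up for illustration and are unrelated to the paper's results.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, cols: predicted)."""
    total = sum(sum(row) for row in confusion)
    size = len(confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class confusion matrix, for illustration only.
kappa = cohen_kappa([[45, 5], [10, 40]])
```

Here the observed agreement is 0.85 and the chance agreement is 0.50, giving a kappa of 0.70; papers often report the value multiplied by 100, as in the abstract above.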