deep learning models

  • Article type: Journal Article
    Brain tumors occur due to the expansion of abnormal cell tissues and can be malignant (cancerous) or benign (non-cancerous). Numerous factors, such as position, size, and progression rate, are considered while detecting and diagnosing brain tumors. Detecting brain tumors in their initial phases is vital for diagnosis, where MRI (magnetic resonance imaging) scans play an important role. Over the years, deep learning models have been extensively used for medical image processing. The current study primarily investigates novel Fine-Tuned Vision Transformer models (FTVTs): FTVT-b16, FTVT-b32, FTVT-l16, and FTVT-l32, for brain tumor classification, while also comparing them with other established deep learning models such as ResNet-50, MobileNet-V2, and EfficientNet-B0. A dataset of 7,023 MRI scans categorized into four classes, namely glioma, meningioma, pituitary, and no tumor, is used for classification. Further, the study presents a comparative analysis of these models, including their accuracies and other evaluation metrics such as recall, precision, and F1-score for each class. The deep learning models ResNet-50, EfficientNet-B0, and MobileNet-V2 obtained accuracies of 96.5%, 95.1%, and 94.9%, respectively. Among all the FTVT models, FTVT-l16 achieved a remarkable accuracy of 98.70%, whereas FTVT-b16, FTVT-b32, and FTVT-l32 achieved accuracies of 98.09%, 96.87%, and 98.62%, respectively, demonstrating the efficacy and robustness of FTVTs in medical image processing.
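    The difference between the b16/l16 and b32/l32 variants is the patch size used in the vision transformer's patch-embedding step. A minimal NumPy sketch of that step is shown below; the `patchify` helper is hypothetical, not the paper's code.

```python
import numpy as np

def patchify(image, patch_size=16):
    """Split a square image (H, W, C) into flattened non-overlapping patches,
    as in ViT-b16/l16 (patch 16) and ViT-b32/l32 (patch 32)."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    n_h, n_w = h // patch_size, w // patch_size
    # Group pixels into a (rows, patch, cols, patch, channels) grid,
    # then flatten each patch into one vector.
    patches = (image.reshape(n_h, patch_size, n_w, patch_size, c)
                    .transpose(0, 2, 1, 3, 4)
                    .reshape(n_h * n_w, patch_size * patch_size * c))
    return patches

# A 224x224 RGB scan yields 196 patch tokens of dimension 768 at patch 16,
# but only 49 tokens of dimension 3072 at patch 32.
img = np.zeros((224, 224, 3))
print(patchify(img, 16).shape)  # (196, 768)
print(patchify(img, 32).shape)  # (49, 3072)
```

    The larger patch size trades sequence length (and attention cost) for per-token dimensionality, which is why the b32/l32 models are cheaper but often slightly less accurate.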

  • Article type: Journal Article
    Disasters caused by mine water inflows significantly threaten the safety of coal mining operations. Deep mining complicates the acquisition of hydrogeological parameters, the mechanics of water inrush, and the prediction of sudden changes in mine water inflow. Traditional models and singular machine learning approaches often fail to accurately forecast abrupt shifts in mine water inflows. This study introduces a novel coupled decomposition-optimization-deep learning model that integrates Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN), Northern Goshawk Optimization (NGO), and Long Short-Term Memory (LSTM) networks. We evaluate three types of mine water inflow forecasting methods: a singular time series prediction model, a decomposition-prediction coupled model, and a decomposition-optimization-prediction coupled model, assessing their ability to capture sudden changes in data trends and their prediction accuracy. Results show that the singular prediction model is optimal with a sliding input step of 3 and a maximum of 400 epochs. Compared to the CEEMDAN-LSTM model, the CEEMDAN-NGO-LSTM model demonstrates superior performance in predicting local extreme shifts in mine water inflow volumes. Specifically, the CEEMDAN-NGO-LSTM model achieves scores of 96.578 in MAE, 1.471% in MAPE, 122.143 in RMSE, and 0.958 in NSE, representing average performance improvements of 44.950% and 19.400% over the LSTM model and CEEMDAN-LSTM model, respectively. Additionally, this model provides the most accurate predictions of mine water inflow volumes over the next five days. Therefore, the decomposition-optimization-prediction coupled model presents a novel technical solution for the safety monitoring of smart mines, offering significant theoretical and practical value for ensuring safe mining operations.
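    The decomposition-prediction coupling can be sketched in miniature: decompose the series into components, forecast each component separately, then recombine. The moving-average split and persistence forecast below are crude stand-ins for CEEMDAN and the NGO-tuned LSTM; all function names are hypothetical.

```python
import numpy as np

def decompose(series, window=3):
    """Crude stand-in for CEEMDAN: split a series into a smooth trend
    (moving average) and a residual component."""
    kernel = np.ones(window) / window
    trend = np.convolve(series, kernel, mode="same")
    return trend, series - trend

def naive_forecast(component, steps):
    """Persistence forecast for one component (placeholder for the
    NGO-optimized LSTM used in the study)."""
    return np.full(steps, component[-1])

def coupled_forecast(series, steps=5):
    """Forecast each component separately, then sum the forecasts —
    the core idea of a decomposition-prediction coupled model."""
    trend, resid = decompose(series)
    return naive_forecast(trend, steps) + naive_forecast(resid, steps)
```

    Because the components sum back to the original series, a constant inflow series yields a constant five-day forecast; the benefit of the real model comes from forecasting each CEEMDAN mode with its own tuned LSTM.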

  • Article type: Journal Article
    The bone marrow overproduces immature cells in the malignancy known as Acute Lymphoblastic Leukemia (ALL). In the United States, about 6,500 cases of ALL are diagnosed each year in both children and adults, comprising nearly 25% of pediatric cancer cases. Recently, many computer-assisted diagnosis (CAD) systems have been proposed to aid hematologists in reducing workload, providing correct results, and managing enormous volumes of data. Traditional CAD systems rely on hematologists' expertise, specialized features, and subject knowledge. Early detection of ALL can aid radiologists and doctors in making medical decisions. In this study, a Deep Dilated Residual Convolutional Neural Network (DDRNet) is presented for the classification of blood cell images, focusing on eosinophils, lymphocytes, monocytes, and neutrophils. To tackle challenges like vanishing gradients and to enhance feature extraction, the model incorporates Deep Residual Dilated Blocks (DRDB) for faster convergence. Conventional residual blocks are strategically placed between layers to preserve original information and extract general feature maps. Global and Local Feature Enhancement Blocks (GLFEB) balance weak contributions from shallow layers for improved feature normalization. The global feature from the initial convolution layer, when combined with GLFEB-processed features, reinforces classification representations. The Tanh function introduces non-linearity. A Channel and Spatial Attention Block (CSAB) is integrated into the neural network to emphasize or suppress specific feature channels, while fully connected layers transform the data. A sigmoid activation function concentrates on relevant features for multiclass lymphoblastic leukemia classification. The model was analyzed with a Kaggle dataset (16,249 images) categorized into four classes, with a training-to-testing ratio of 80:20.
Experimental results showed that the feature discrimination ability of the DRDB, GLFEB, and CSAB blocks boosted the DDRNet model's F1-score to 0.96 with minimal computational complexity and optimum classification accuracies of 99.86% and 91.98% on the training and testing data, respectively. The DDRNet model stands out from existing methods due to its high testing accuracy of 91.98%, F1-score of 0.96, minimal computational complexity, and enhanced feature discrimination ability. The strategic combination of these blocks (DRDB, GLFEB, and CSAB) is designed to address specific challenges in the classification process, leading to improved discrimination of features crucial for accurate multi-class blood cell image identification. Their effective integration within the model contributes to the superior performance of DDRNet.
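    The receptive-field enlargement behind the dilated residual blocks can be illustrated with a one-dimensional dilated convolution: the kernel taps are spaced `dilation` samples apart, so the kernel covers a wider span without extra parameters. This is a didactic sketch, not the DDRNet implementation.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation=2):
    """1-D dilated convolution with valid padding. A kernel of length k
    covers a span of (k - 1) * dilation + 1 input samples."""
    k = len(kernel)
    span = (k - 1) * dilation + 1
    return np.array([
        sum(kernel[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ])

# With dilation 2, a length-2 kernel pairs samples two positions apart.
print(dilated_conv1d(np.array([1, 2, 3, 4, 5]), [1, 1], dilation=2))  # [4 6 8]
```

    Stacking blocks with increasing dilation grows the receptive field exponentially while keeping the parameter count fixed, which is what makes such blocks attractive for feature extraction.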

  • Article type: Journal Article
    This study outlines a method for using surveillance cameras and an algorithm that calls a deep learning model to generate video segments featuring salmon and trout in small streams. This automated process greatly reduces the need for human intervention in video surveillance. Furthermore, a comprehensive guide is provided on setting up and configuring surveillance equipment, along with instructions on training a deep learning model tailored to specific requirements. Access to video data and knowledge about deep learning models makes monitoring of trout and salmon dynamic and hands-on, as the collected data can be used to train and further improve deep learning models. Hopefully, this setup will encourage fisheries managers to conduct more monitoring, as the equipment is relatively cheap compared with customized solutions for fish monitoring. To make effective use of the data, natural markings of the camera-captured fish can be used for individual identification. While the automated process greatly reduces the need for human intervention in video surveillance and speeds up the initial sorting and detection of fish, the manual identification of individual fish based on natural markings still requires human effort and involvement. Individual encounter data hold many potential applications, such as capture-recapture and relative abundance models, and for evaluating fish passages in streams with hydropower by spatial recaptures, that is, the same individual identified at different locations. There is much to gain by using this technique, as camera captures are the better option for the fish's welfare and are less time-consuming compared with physical captures and tagging.
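    The clip-generation step the method describes — turning a detector's per-frame fish detections into reviewable video segments — might be sketched as follows. The `detections_to_segments` helper and its parameters are hypothetical, not the authors' code.

```python
def detections_to_segments(frames, fps=25, gap=2.0):
    """Merge frame indices where a fish was detected into video segments,
    closing gaps shorter than `gap` seconds, so each segment becomes one
    clip for manual review of natural markings."""
    segments = []
    for f in sorted(frames):
        t = f / fps
        if segments and t - segments[-1][1] <= gap:
            segments[-1][1] = t          # extend the current segment
        else:
            segments.append([t, t])      # start a new segment
    return [(round(a, 2), round(b, 2)) for a, b in segments]
```

    A small `gap` keeps a fish that briefly leaves the frame inside one clip; the trade-off is clip length versus the number of clips a manager must review.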

  • Article type: Journal Article
    Clinical Decision Support Systems (CDSS) are essential tools in contemporary healthcare, enhancing clinicians' decisions and patient outcomes. The integration of artificial intelligence (AI) is now revolutionizing CDSS even further. This review delves into AI technologies transforming CDSS, their applications in healthcare decision-making, associated challenges, and the potential trajectory toward fully realizing AI-CDSS's potential. The review begins by laying the groundwork with a definition of CDSS and its function within the healthcare field. It then highlights the increasingly significant role that AI is playing in enhancing CDSS effectiveness and efficiency, underlining its evolving prominence in shaping healthcare practices. It examines the integration of AI technologies into CDSS, including machine learning algorithms like neural networks and decision trees, natural language processing, and deep learning. It also addresses the challenges associated with AI integration, such as interpretability and bias. We then shift to AI applications within CDSS, with real-life examples of AI-driven diagnostics, personalized treatment recommendations, risk prediction, early intervention, and AI-assisted clinical documentation. The review emphasizes user-centered design in AI-CDSS integration, addressing usability, trust, workflow, and ethical and legal considerations. It acknowledges prevailing obstacles and suggests strategies for successful AI-CDSS adoption, highlighting the need for workflow alignment and interdisciplinary collaboration. The review concludes by summarizing key findings, underscoring AI's transformative potential in CDSS, and advocating for continued research and innovation. It emphasizes the need for collaborative efforts to realize a future where AI-powered CDSS optimizes healthcare delivery and improves patient outcomes.

  • Article type: Journal Article
    Elderly falls are a major threat, with 1.5-2 million elderly people experiencing severe injuries and 1 million deaths yearly. Falls experienced by elderly people may have a long-term negative impact on their physical and psychological health. Much recent healthcare research has focused on detecting and preventing falls. In this work, an Artificial Intelligence (AI) edge-computing-based wearable device is designed and developed for the detection and prevention of falls in elderly people. Further, various deep learning algorithms such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) are utilized for activity recognition of the elderly. Also, CNN-LSTM, RNN-LSTM, and GRU-LSTM models, with and without an attention layer, are utilized, and the performance metrics are analyzed to find the best deep learning model. Furthermore, three different hardware boards, the Jetson Nano developer board and Raspberry Pi 3 and 4, are utilized as AI edge computing devices; the best deep learning model is implemented on each and the computation time evaluated. Results demonstrate that the CNN-LSTM with an attention layer exhibits accuracy, recall, precision, and F1-score of 97%, 98%, 98%, and 0.98, respectively, which is better than the other deep learning models. Also, the computation time of the NVIDIA Jetson Nano is lower than that of the other edge computing devices. This work appears to be of high societal relevance, since the proposed wearable device can be used to monitor the activity of elderly people and prevent falls, improving their quality of life.
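    The attention layer's role of weighting LSTM time steps before classification can be sketched as softmax attention pooling over the hidden-state sequence. This is a minimal NumPy illustration of the mechanism, not the deployed model.

```python
import numpy as np

def attention_pool(h, w):
    """Temporal attention pooling over hidden states h of shape (T, D):
    score each time step with vector w, softmax the scores into weights,
    and return the weighted sum — a (D,) summary of the sequence."""
    scores = h @ w                     # one scalar score per time step
    scores = scores - scores.max()     # subtract max for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ h
```

    Compared with taking only the last hidden state, attention pooling lets the classifier focus on the few time steps (e.g., the moment of impact) that matter for distinguishing a fall from ordinary activity.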

  • Article type: Journal Article
    We investigate the impact of different data modalities for cattle weight estimation. For this purpose, we collect and present our own cattle dataset representing the data modalities: RGB, depth, combined RGB and depth, segmentation, and combined segmentation and depth information. We explore a recent vision-transformer-based zero-shot model proposed by Meta AI Research for producing the segmentation data modality and for extracting the cattle-only region from the images. For experimental analysis, we consider three baseline deep learning models. The objective is to assess how the integration of diverse data sources influences the accuracy and robustness of the deep learning models considering four different performance metrics: mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and R-squared (R2). We explore the synergies and challenges associated with each modality and their combined use in enhancing the precision of cattle weight prediction. Through comprehensive experimentation and evaluation, we aim to provide insights into the effectiveness of different data modalities in improving the performance of established deep learning models, facilitating informed decision-making for precision livestock management systems.
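    The four reported metrics follow their usual definitions and can be computed directly; a straightforward NumPy sketch:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, MAPE (%), and R2 — the four metrics used to compare
    the data modalities for weight estimation."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    err = y_true - y_pred
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    mape = 100 * np.abs(err / y_true).mean()
    ss_res = (err ** 2).sum()
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()
    r2 = 1 - ss_res / ss_tot
    return mae, rmse, mape, r2
```

    MAE and RMSE share the target's unit (kilograms here), MAPE normalizes by the true weight, and R2 measures variance explained, so the four together separate scale errors from fit quality.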

  • Article type: Multicenter Study
    BACKGROUND: Chronic obstructive pulmonary disease (COPD) is underdiagnosed with the current gold standard measure pulmonary function test (PFT). A more sensitive and simple option for early detection and severity evaluation of COPD could benefit practitioners and patients.
    METHODS: In this multicenter retrospective study, frontal chest X-ray (CXR) images and related clinical information of 1,055 participants were collected and processed. Different deep learning algorithms and transfer learning models were trained to classify COPD based on clinical data and CXR images from 666 subjects, and validated on an internal test set of 284 participants. An external test including 105 participants was also performed to verify the generalization ability of the learning algorithms in diagnosing COPD. Meanwhile, the model was further used to evaluate the disease severity of COPD by predicting different grades.
    RESULTS: The Ensemble model showed an AUC of 0.969 in distinguishing COPD by simultaneously extracting fusion features of clinical parameters and CXR images in the internal test, better than models that used clinical parameters (AUC = 0.963) or images (AUC = 0.946) only. For the external test set, the AUC declined slightly to 0.934 in predicting COPD based on clinical parameters and CXR images. When applying the Ensemble model to determine disease severity of COPD, the AUC reached 0.894 for three-class grading and 0.852 for five-class grading, respectively.
    CONCLUSIONS: The present study used DL algorithms to screen COPD and predict disease severity based on CXR imaging and clinical parameters. The models showed good performance and the approach might be an effective case-finding tool with low radiation dose for COPD diagnosis and staging.
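    The AUC values reported above follow the usual rank-based definition: the probability that a randomly chosen positive case scores above a randomly chosen negative one. A minimal sketch, illustrative only and not the study's evaluation code:

```python
def auc_score(labels, scores):
    """AUC as the rank-sum probability that a positive (label 1) scores
    above a negative (label 0); ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Two positives and two negatives: three of four pairs are ranked correctly.
print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

    This pairwise form makes clear why AUC is threshold-free: only the ordering of the model's scores matters, not their absolute values.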

  • Article type: Journal Article
    OBJECTIVE: Deep learning-based auto-segmentation algorithms can improve clinical workflow by defining accurate regions of interest while reducing manual labor. Over the past decade, convolutional neural networks (CNNs) have become prominent in medical image segmentation applications. However, CNNs have limitations in learning long-range spatial dependencies due to the locality of the convolutional layers. Transformers were introduced to address this challenge. In transformers with a self-attention mechanism, even the first layer of information processing makes connections between distant image locations. Our paper presents a novel framework that bridges these two unique techniques, CNNs and transformers, to segment the gross tumor volume (GTV) accurately and efficiently in computed tomography (CT) images of non-small cell lung cancer (NSCLC) patients.
    METHODS: Under this framework, multiple-resolution image inputs were used with multi-depth backbones to retain the benefits of high-resolution and low-resolution images in the deep learning architecture. Furthermore, a deformable transformer was utilized to learn long-range dependencies on the extracted features. To reduce computational complexity and to efficiently process multi-scale, multi-depth, high-resolution 3D images, this transformer attends to a small set of key positions, which were identified by a self-attention mechanism. We evaluated the performance of the proposed framework on an NSCLC dataset containing 563 training images and 113 test images. Our novel deep learning algorithm was benchmarked against five other similar deep learning models.
    RESULTS: The experimental results indicate that our proposed framework outperforms other CNN-based, transformer-based, and hybrid methods in terms of Dice score (0.92) and Hausdorff Distance (1.33). Therefore, our proposed model could potentially improve the efficiency of auto-segmentation of early-stage NSCLC during the clinical workflow. This type of framework may potentially facilitate online adaptive radiotherapy, where an efficient auto-segmentation workflow is required.
    CONCLUSIONS: Our deep learning framework, based on CNN and transformer, performs auto-segmentation efficiently and could potentially assist clinical radiotherapy workflow.
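    The Dice score used for benchmarking has a simple closed form on binary masks: twice the overlap divided by the sum of the two mask sizes. A minimal sketch, not the paper's evaluation pipeline:

```python
import numpy as np

def dice_score(a, b):
    """Dice coefficient between two binary masks: 1.0 is perfect overlap,
    0.0 is no overlap — the metric (0.92 here) used to compare the
    predicted and reference GTV contours."""
    a = np.asarray(a, bool)
    b = np.asarray(b, bool)
    inter = np.logical_and(a, b).sum()
    return 2 * inter / (a.sum() + b.sum())
```

    Unlike Dice, the Hausdorff distance also reported above measures the worst-case boundary error, so the two metrics together capture both overall overlap and outlier contour points.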

  • Article type: Journal Article
    Sign language is designed as a natural communication method to convey messages among the deaf community. In the study of sign language recognition through wearable sensors, the data sources are limited, and the data acquisition process is complex. This research aims to collect an American sign language dataset with a wearable inertial motion capture system and realize the recognition and end-to-end translation of sign language sentences with deep learning models. In this work, a dataset consisting of 300 commonly used sentences is gathered from 3 volunteers. In the design of the recognition network, the model mainly consists of three layers: convolutional neural network, bi-directional long short-term memory, and connectionist temporal classification. The model achieves accuracy rates of 99.07% in word-level evaluation and 97.34% in sentence-level evaluation. In the design of the translation network, the encoder-decoder structured model is mainly based on long short-term memory with global attention. The word error rate of end-to-end translation is 16.63%. The proposed method has the potential to recognize more sign language sentences with reliable inertial data from the device.
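    The connectionist temporal classification (CTC) layer in the recognition network is typically decoded greedily: take the most likely symbol per frame, collapse consecutive repeats, then drop the blank symbol. A minimal sketch of that decoding step (the function name is hypothetical):

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Greedy CTC decoding of per-frame argmax labels: collapse
    consecutive repeats, then remove the blank symbol. The blank lets
    the model emit the same word twice in a row (e.g., 1, blank, 1)."""
    out, prev = [], None
    for s in frame_labels:
        if s != prev and s != blank:
            out.append(s)
        prev = s
    return out

# Frames 0,1,1,0,1,2,2,0 (0 = blank) decode to the word sequence 1,1,2.
print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0]))  # [1, 1, 2]
```

    CTC is what lets the BiLSTM be trained on sentence labels without frame-level alignment, since all frame sequences that collapse to the same word sequence are treated as the same output.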
