Multimodal deep learning

  • Article type: Journal Article
    BACKGROUND: Clinical notes contain contextualized information beyond structured data related to patients' past and current health status.
    OBJECTIVE: This study aimed to design a multimodal deep learning approach to improve the evaluation precision of hospital outcomes for heart failure (HF) using admission clinical notes and easily collected tabular data.
    METHODS: Data for the development and validation of the multimodal model were retrospectively derived from 3 open-access US databases, including the Medical Information Mart for Intensive Care III v1.4 (MIMIC-III) and MIMIC-IV v1.0, collected from a teaching hospital from 2001 to 2019, and the eICU Collaborative Research Database v1.2, collected from 208 hospitals from 2014 to 2015. The study cohorts consisted of all patients with critical HF. The clinical notes, including chief complaint, history of present illness, physical examination, medical history, and admission medication, as well as clinical variables recorded in electronic health records, were analyzed. We developed a deep learning mortality prediction model for in-hospital patients, which underwent complete internal, prospective, and external evaluation. The Integrated Gradients and SHapley Additive exPlanations (SHAP) methods were used to analyze the importance of risk factors.
    RESULTS: The study included 9989 (16.4%) patients in the development set, 2497 (14.1%) patients in the internal validation set, 1896 (18.3%) patients in the prospective validation set, and 7432 (15%) patients in the external validation set. The area under the receiver operating characteristic curve of the model was 0.838 (95% CI 0.827-0.851), 0.849 (95% CI 0.841-0.856), and 0.767 (95% CI 0.762-0.772) for the internal, prospective, and external validation sets, respectively. The area under the receiver operating characteristic curve of the multimodal model outperformed that of the unimodal models in all test sets, and tabular data contributed to higher discrimination. The medical history and physical examination were more useful than other factors in early assessments.
    CONCLUSIONS: The multimodal deep learning model combining admission notes and clinical tabular data showed promising efficacy as a potentially novel method for evaluating the risk of mortality in patients with HF, providing more accurate and timely decision support.
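
    As a concrete picture of the approach described above, the sketch below fuses a pre-computed admission-note embedding with tabular clinical variables before a mortality head. This is a minimal PyTorch illustration, not the authors' published architecture; the dimensions (a 768-d note embedding, 32 tabular variables) and module names are assumptions.

```python
import torch
import torch.nn as nn

class NoteTabularFusion(nn.Module):
    """Fuses a pre-computed note embedding with tabular clinical features."""
    def __init__(self, note_dim=768, tab_dim=32, hidden=128):
        super().__init__()
        self.note_proj = nn.Sequential(nn.Linear(note_dim, hidden), nn.ReLU())
        self.tab_proj = nn.Sequential(nn.Linear(tab_dim, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, 1)  # logit for in-hospital mortality

    def forward(self, note_emb, tab):
        fused = torch.cat([self.note_proj(note_emb), self.tab_proj(tab)], dim=-1)
        return self.head(fused)

model = NoteTabularFusion()
prob = torch.sigmoid(model(torch.randn(4, 768), torch.randn(4, 32)))  # batch of 4
```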

  • Article type: Journal Article
    Bipolar disorder (BD) is characterized by recurrent episodes of depression and mild mania. In this paper, to address the common issue of insufficient accuracy in existing methods and meet the requirements of clinical diagnosis, we propose a framework called the Spatio-temporal Feature Fusion Transformer (STF2Former). It improves on our previous work, MFFormer, by introducing a Spatio-temporal Feature Aggregation Module (STFAM) to learn the temporal and spatial features of rs-fMRI data, promoting intra-modality attention and information fusion across different modalities. Specifically, the method decouples the temporal and spatial dimensions and designs two feature extraction modules for extracting temporal and spatial information separately. Extensive experiments demonstrate the effectiveness of the proposed STFAM in extracting features from rs-fMRI, and show that STF2Former significantly outperforms MFFormer and achieves better results than other state-of-the-art methods.
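
    The decoupling idea described above, separate extraction paths for the temporal and spatial dimensions of rs-fMRI, can be sketched roughly as follows. This is an illustrative reading, not the published STFAM/STF2Former code; the region and time-point counts are placeholders.

```python
import torch
import torch.nn as nn

class DecoupledSpatioTemporal(nn.Module):
    """Attends over time points and over regions separately, then fuses."""
    def __init__(self, n_regions=90, n_time=120, d_model=64, n_classes=2):
        super().__init__()
        self.t_embed = nn.Linear(n_regions, d_model)  # one token per time point
        self.s_embed = nn.Linear(n_time, d_model)     # one token per region
        self.temporal = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), 2)
        self.spatial = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), 2)
        self.head = nn.Linear(2 * d_model, n_classes)

    def forward(self, x):  # x: (batch, time, regions) rs-fMRI time series
        t_feat = self.temporal(self.t_embed(x)).mean(dim=1)
        s_feat = self.spatial(self.s_embed(x.transpose(1, 2))).mean(dim=1)
        return self.head(torch.cat([t_feat, s_feat], dim=-1))

model = DecoupledSpatioTemporal()
logits = model(torch.randn(4, 120, 90))  # 4 subjects, 120 TRs, 90 ROIs
```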

  • Article type: Journal Article
    To simulate the functions of olfaction, gustation, vision, and oral touch, intelligent sensory technologies have been developed. Headspace solid-phase microextraction gas chromatography-mass spectrometry (HS-SPME-GC/MS) together with electronic noses (E-noses), electronic tongues (E-tongues), computer vision (CV), and texture analyzers (TAs) was applied for sensory characterization of lamb shashliks (LSs) prepared with various roasting methods. A total of 56 VOCs were identified by HS-SPME/GC-MS in lamb shashliks prepared with five roasting methods, and 21 VOCs were identified as key compounds based on their odor activity values (OAV > 1). A Cross-Channel Sensory Transformer (CCST) was also proposed and used to predict 19 sensory attributes and the corresponding lamb shashlik scores under different roasting methods. The model achieved satisfactory results on the prediction set (R2 = 0.964). This study shows that a multimodal deep learning model can be used to simulate assessors, and that it is feasible to use it to guide and correct sensory evaluation.
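
    The cross-channel idea, treating each instrument's readout (E-nose, E-tongue, CV, TA) as a token so attention can mix information across channels before regressing sensory attributes, might look like the toy sketch below. The channel dimensions and layer counts are invented, and this is not the published CCST.

```python
import torch
import torch.nn as nn

class CrossChannelRegressor(nn.Module):
    """Projects each sensor channel to a token, attends across tokens,
    and regresses one score per sensory attribute."""
    def __init__(self, channel_dims=(32, 16, 8, 64), d_model=64, n_attrs=19):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(d, d_model) for d in channel_dims)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_attrs)  # one score per attribute

    def forward(self, channels):  # list of (B, dim_i) tensors, one per instrument
        tokens = torch.stack([p(c) for p, c in zip(self.proj, channels)], dim=1)
        return self.head(self.encoder(tokens).mean(dim=1))

model = CrossChannelRegressor()
preds = model([torch.randn(4, d) for d in (32, 16, 8, 64)])  # (4, 19)
```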

  • Article type: Journal Article
    Drug-drug interactions (DDIs) can have a significant impact on patient safety and health. Predicting potential DDIs before administering drugs to patients is a critical step in drug development and can help prevent adverse drug events. In this study, we propose a novel method called HF-DDI for predicting DDI events based on various drug features, including molecular structure, target, and enzyme information. Specifically, we design our model with both early fusion and late fusion strategies and utilize a score calculation module to predict the likelihood of interactions between drugs. Our model was trained and tested on a large data set of known DDIs, achieving an overall accuracy of 0.948. The results suggest that incorporating multiple drug features can improve the accuracy of DDI event prediction and may be useful for improving drug safety and patient outcomes.
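
    Early fusion (concatenate raw feature vectors before encoding) and late fusion (encode each feature type separately, then combine the scores) can be contrasted in a few lines. The sketch below is a toy illustration with hypothetical feature dimensions, not the actual HF-DDI design.

```python
import torch
import torch.nn as nn

class EarlyLateFusionDDI(nn.Module):
    """Toy contrast of early fusion (concatenate features before encoding)
    and late fusion (per-feature encoders whose scores are combined)."""
    def __init__(self, struct_dim=167, target_dim=100, enzyme_dim=50, hidden=128):
        super().__init__()
        total = struct_dim + target_dim + enzyme_dim
        self.early = nn.Sequential(nn.Linear(total, hidden), nn.ReLU(),
                                   nn.Linear(hidden, 1))
        self.late = nn.ModuleList(
            nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for d in (struct_dim, target_dim, enzyme_dim))

    def forward(self, struct, target, enzyme):
        early_logit = self.early(torch.cat([struct, target, enzyme], dim=-1))
        late_logit = sum(m(x) for m, x in
                         zip(self.late, (struct, target, enzyme))) / 3
        return (early_logit + late_logit) / 2  # simple score-combination step

model = EarlyLateFusionDDI()
p = torch.sigmoid(model(torch.randn(4, 167), torch.randn(4, 100),
                        torch.randn(4, 50)))  # interaction likelihood per pair
```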

  • Article type: Journal Article
    Water plays a very important role in the growth of tomato (Solanum lycopersicum L.), and how to detect the water status of tomato is the key to precise irrigation. The objective of this study was to detect the water status of tomato by fusing RGB, NIR, and depth image information through deep learning. Five irrigation levels were set to cultivate tomatoes in different water states, with irrigation amounts of 150%, 125%, 100%, 75%, and 50% of the reference evapotranspiration calculated by a modified Penman-Monteith equation. The water status of the tomatoes was divided into five categories: severe irrigation deficit, slight irrigation deficit, moderate irrigation, slight over-irrigation, and severe over-irrigation. RGB images, depth images, and NIR images of the upper part of the tomato plant were taken as data sets. The data sets were used to train and test tomato water status detection models built with single-modal and multimodal deep learning networks. In the single-modal networks, two CNNs, VGG-16 and ResNet-50, were each trained on single RGB, depth, or NIR images, for a total of six cases. In the multimodal networks, two or more of the RGB, depth, and NIR images were used to train VGG-16 or ResNet-50, for a total of 20 combinations. Results showed that the accuracy of tomato water status detection based on single-modal deep learning ranged from 88.97% to 93.09%, while that based on multimodal deep learning ranged from 93.09% to 99.18%. Multimodal deep learning significantly outperformed single-modal deep learning. The optimal model was built with a multimodal deep learning network using ResNet-50 for the RGB images and VGG-16 for the depth and NIR images. This study provides a novel method for non-destructive detection of the water status of tomato and gives a reference for precise irrigation management.
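
    The best configuration reported, ResNet-50 for RGB plus VGG-16 for depth and NIR, suggests a two-branch design along the lines sketched below. Stacking depth and NIR into one 2-channel VGG input is our assumption for illustration, not a detail stated in the abstract.

```python
import torch
import torch.nn as nn
from torchvision import models

class WaterStatusNet(nn.Module):
    """Two-branch fusion: ResNet-50 on RGB, VGG-16 on stacked depth+NIR."""
    def __init__(self, n_classes=5):
        super().__init__()
        self.rgb = models.resnet50(weights=None)
        self.rgb.fc = nn.Identity()                            # -> 2048-d features
        self.aux = models.vgg16(weights=None)
        self.aux.features[0] = nn.Conv2d(2, 64, 3, padding=1)  # 2-ch depth+NIR input
        self.aux.classifier[-1] = nn.Identity()                # -> 4096-d features
        self.head = nn.Linear(2048 + 4096, n_classes)          # 5 water-status classes

    def forward(self, rgb, depth_nir):  # (B,3,H,W) and (B,2,H,W)
        return self.head(torch.cat([self.rgb(rgb), self.aux(depth_nir)], dim=-1))

model = WaterStatusNet()
logits = model(torch.randn(1, 3, 224, 224), torch.randn(1, 2, 224, 224))
```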

  • Article type: Journal Article
    Multimodal deep learning models have been applied to disease prediction tasks, but training is difficult because of conflicts between the sub-models and the fusion modules. To alleviate this issue, we propose a framework for decoupling feature alignment and fusion (DeAF), which separates multimodal model training into two stages. In the first stage, unsupervised representation learning is conducted, and a modality adaptation (MA) module is used to align the features from the various modalities. In the second stage, a self-attention fusion (SAF) module combines the medical image features and clinical data using supervised learning. Moreover, we apply the DeAF framework to predict the postoperative efficacy of CRS for colorectal cancer and whether patients with mild cognitive impairment (MCI) convert to Alzheimer's disease. The DeAF framework achieves a significant improvement over previous methods. Furthermore, extensive ablation experiments were conducted to demonstrate the rationality and effectiveness of our framework. In conclusion, our framework enhances the interaction between local medical image features and clinical data and derives more discriminative multimodal features for disease prediction. The framework implementation is available at https://github.com/cchencan/DeAF.
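
    The two-stage recipe, align modality features without labels first, then train only the fusion module with supervision, can be skeletonized as follows. The cosine-alignment loss, the frozen encoders, and the loader signature are placeholders standing in for the paper's MA and SAF modules, not their actual objectives.

```python
import torch

def train_decoupled(image_enc, tab_enc, align_mod, fusion_mod, loader, epochs=10):
    """Skeleton of two-stage training in the spirit of DeAF."""
    opt1 = torch.optim.Adam([*image_enc.parameters(), *tab_enc.parameters(),
                             *align_mod.parameters()], lr=1e-4)
    for _ in range(epochs):                      # stage 1: unsupervised alignment
        for img, tab, _ in loader:
            zi, zt = align_mod(image_enc(img), tab_enc(tab))
            loss = (1 - torch.cosine_similarity(zi, zt, dim=-1)).mean()
            opt1.zero_grad(); loss.backward(); opt1.step()

    for p in [*image_enc.parameters(), *tab_enc.parameters()]:
        p.requires_grad_(False)                  # freeze the aligned encoders
    opt2 = torch.optim.Adam(fusion_mod.parameters(), lr=1e-4)
    ce = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):                      # stage 2: supervised fusion
        for img, tab, y in loader:
            loss = ce(fusion_mod(image_enc(img), tab_enc(tab)), y)
            opt2.zero_grad(); loss.backward(); opt2.step()
```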

  • Article type: Journal Article
    Background: It is known that oral diseases such as periodontal (gum) disease are closely linked to various systemic diseases and disorders. Advances in deep learning have the potential to make major contributions to healthcare, particularly in domains that rely on medical imaging. Incorporating non-imaging information based on clinical and laboratory data may allow clinicians to make more comprehensive and accurate decisions. Methods: Here, we developed a multimodal deep learning method to predict systemic diseases and disorders from oral health conditions. A dual-loss autoencoder was used in the first phase to extract periodontal disease-related features from 1188 panoramic radiographs. Then, in the second phase, we fused the image features with demographic data and clinical information taken from electronic health records (EHR) to predict systemic diseases. We used the receiver operating characteristic (ROC) curve and accuracy to evaluate our model. The model was further validated on an unseen test dataset. Findings: According to our findings, the top three most accurately predicted chapters, in order, are Chapters III, VI, and IX. The results indicated that the proposed model could predict systemic diseases belonging to Chapters III, VI, and IX with AUC values of 0.92 (95% CI 0.90-0.94), 0.87 (95% CI 0.84-0.89), and 0.78 (95% CI 0.75-0.81), respectively. To assess the robustness of the model, we performed the evaluation on the unseen test dataset for these chapters, which showed accuracies of 0.88, 0.82, and 0.72 for Chapters III, VI, and IX, respectively. Interpretation: The present study shows that the combination of panoramic radiographs and clinical oral features can be used to train a fusion deep learning model for predicting systemic diseases and disorders.
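
    A dual-loss autoencoder pairs a reconstruction objective with a supervised head on the latent code, so the learned image features stay disease-relevant rather than purely generative. The sketch below illustrates the pattern with made-up layer sizes and a binary periodontal label; it is not the study's actual network.

```python
import torch
import torch.nn as nn

class DualLossAutoencoder(nn.Module):
    """Autoencoder whose latent code feeds both a decoder and a classifier."""
    def __init__(self, latent=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, latent))
        self.decoder = nn.Sequential(
            nn.Linear(latent, 32 * 8 * 8), nn.Unflatten(1, (32, 8, 8)),
            nn.Upsample(scale_factor=4), nn.Conv2d(32, 1, 3, padding=1))
        self.cls = nn.Linear(latent, 2)  # e.g., periodontal disease yes/no

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.cls(z), z

model = DualLossAutoencoder()
x = torch.randn(2, 1, 32, 32)                 # toy 32x32 grayscale radiographs
recon, logits, _ = model(x)
loss = (nn.functional.mse_loss(recon, x)      # reconstruction loss
        + nn.functional.cross_entropy(logits, torch.tensor([0, 1])))  # label loss
```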

  • Article type: Journal Article
    OBJECTIVE: To develop a multimodal deep-learning model for classifying Chinese medicine constitution, i.e., the balanced and unbalanced constitutions, based on inspection of tongue and face images, pulse waves from palpation, and health information from a total of 540 subjects.
    METHODS: The study data consisted of tongue and face images, pulse waves obtained by palpation, and health information, including personal information, life habits, medical history, and current symptoms, from 540 subjects (202 males and 338 females). Convolutional neural networks, recurrent neural networks, and fully connected neural networks were used to extract deep features from the data. Feature fusion and decision fusion models were constructed for the multimodal data.
    RESULTS: The optimal models for the tongue and face images, pulse waves, and health information were ResNet18, the Gated Recurrent Unit, and entity embedding, respectively. Feature fusion was superior to decision fusion. The multimodal analysis revealed that multimodal data compensated for the loss of information from a single modality, resulting in improved classification performance.
    CONCLUSIONS: Multimodal data fusion can supplement single-modality information and improve classification performance. Our research underscores the effectiveness of multimodal deep learning technology in identifying body constitution, supporting the modernization and intelligent application of Chinese medicine.
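
    The feature-fusion versus decision-fusion comparison is easy to state in code: concatenate modality features before a single classifier, or classify each modality separately and combine the outputs. A toy sketch with hypothetical feature sizes:

```python
import torch
import torch.nn as nn

def feature_fusion(feats, head):
    """Feature-level fusion: concatenate modality features, classify once."""
    return head(torch.cat(feats, dim=-1))

def decision_fusion(feats, heads, weights=None):
    """Decision-level fusion: classify each modality, then combine the logits."""
    logits = torch.stack([h(f) for h, f in zip(heads, feats)])
    if weights is None:
        return logits.mean(dim=0)
    w = torch.tensor(weights).view(-1, 1, 1)
    return (w * logits).sum(dim=0)

# Hypothetical feature sizes for tongue/face images (512), pulse waves (64),
# and health information (32); the study found feature fusion worked better.
feats = [torch.randn(4, d) for d in (512, 64, 32)]
fused_head = nn.Linear(512 + 64 + 32, 2)
per_modality = [nn.Linear(d, 2) for d in (512, 64, 32)]
print(feature_fusion(feats, fused_head).shape,
      decision_fusion(feats, per_modality).shape)
```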

  • Article type: Journal Article
    Microsatellite instability (MSI), an important biomarker for immunotherapy and the diagnosis of Lynch syndrome, refers to changes in microsatellite (MS) sequence length caused by insertions or deletions during DNA replication. However, traditional MSI detection based on wet-lab experiments is time-consuming and depends on experimental conditions. In addition, a comprehensive study of the associations between MSI status and various molecules such as mRNA and miRNA has not been performed. In this study, we first examined the associations between MSI status and several molecular data types, including mRNA, miRNA, lncRNA, DNA methylation, and copy number variation (CNV), using colorectal cancer data from The Cancer Genome Atlas (TCGA). We then developed a novel deep learning framework to predict MSI status based solely on hematoxylin and eosin (H&E) staining images, and combined the H&E images with the above-mentioned molecular data by multimodal compact bilinear pooling. Our results showed significant differences in mRNA, miRNA, and lncRNA between the high microsatellite instability (MSI-H) patient group and the low microsatellite instability or microsatellite stability (MSI-L/MSS) patient group. Using the H&E image alone, MSI status could be predicted with an acceptable area under the curve (AUC) of 0.809 in 5-fold cross-validation. Fusion models integrating the H&E image with a single type of molecular data achieved higher prediction accuracy than the H&E image alone, with the highest AUC of 0.952 obtained when combining the H&E image with DNA methylation data. However, prediction accuracy decreased when the H&E image was combined with all types of molecular data. In conclusion, combining H&E images with deep learning can predict the MSI status of colorectal cancer, and the accuracy can be further improved by integrating appropriate molecular data. This study may have clinical significance in practice.
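
    Multimodal compact bilinear pooling approximates the outer product of two feature vectors by count-sketching each one and multiplying the sketches in the FFT domain. Below is a minimal sketch of the operator; the H&E and molecular feature dimensions are hypothetical.

```python
import torch

def count_sketch(x, h, s, d):
    """Project x (B, n) to d dims using fixed hash indices h and signs s."""
    out = x.new_zeros(x.size(0), d)
    out.index_add_(1, h, x * s)
    return out

def compact_bilinear(x, y, d=1024, seed=0):
    """Count-sketch both inputs, multiply in the FFT domain, invert."""
    g = torch.Generator().manual_seed(seed)
    hx = torch.randint(d, (x.size(1),), generator=g)
    sx = torch.randint(0, 2, (x.size(1),), generator=g).float() * 2 - 1
    hy = torch.randint(d, (y.size(1),), generator=g)
    sy = torch.randint(0, 2, (y.size(1),), generator=g).float() * 2 - 1
    fx = torch.fft.rfft(count_sketch(x, hx, sx, d))
    fy = torch.fft.rfft(count_sketch(y, hy, sy, d))
    return torch.fft.irfft(fx * fy, n=d)

img_feat = torch.randn(4, 2048)  # H&E image features (hypothetical dims)
mol_feat = torch.randn(4, 300)   # e.g., DNA methylation features
fused = compact_bilinear(img_feat, mol_feat)  # (4, 1024) fused representation
```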

  • Article type: Journal Article
    Single-modality MRI data are not enough to depict and discern the cause of the underlying brain pathology of Alzheimer's disease (AD). Most existing studies do not perform well in multi-group classification. To reveal the structural, functional connectivity, and functional topological relationships among different stages of mild cognitive impairment (MCI) and AD, a novel method for analyzing regional importance with an improved deep learning model is proposed in this paper. An obvious drift of related cognitive regions can be observed in the prefrontal lobe and around the cingulate area in the right hemisphere when comparing AD and healthy controls (HC) based on the absolute weights in the classification mode. Alterations of these regions have previously been reported to be responsible for cognitive impairment. Different parcellation atlases of the human cerebral cortex were compared, and the fine-grained multimodal parcellation HCPMMP, with 180 cortical areas per hemisphere, performed the best. In multi-group classification, the highest accuracy achieved was 96.86%, using the structural and functional topological modalities as input to the training model. The weights in the trained model with perfect discriminating ability quantify the importance of each cortical region. To the best of our knowledge, this is the first time such a phenomenon has been discovered and the weights of cortical areas have been precisely described in AD and its prodromal stages. Our findings can serve as a basis for other study models to differentiate patterns in various diseases with cognitive impairments and can help identify the underlying pathology.
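
    Reading per-region importance off a trained classifier amounts to aggregating the absolute weights assigned to each input region. The toy numpy illustration below mimics that analysis with randomly generated placeholder weights over 360 HCPMMP-style regions; it is not the paper's actual pipeline.

```python
import numpy as np

def region_importance(weight_matrix, region_names, top_k=10):
    """Rank regions by the summed absolute classifier weight they receive."""
    scores = np.abs(weight_matrix).sum(axis=0)  # aggregate over output classes
    order = np.argsort(scores)[::-1][:top_k]
    return [(region_names[i], float(scores[i])) for i in order]

rng = np.random.default_rng(0)
n_regions = 360                                 # 180 per hemisphere (HCPMMP)
w = rng.normal(size=(4, n_regions))             # placeholder 4-way classifier weights
names = [f"region_{i}" for i in range(n_regions)]
for name, score in region_importance(w, names, top_k=5):
    print(f"{name}: {score:.3f}")
```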