Variational Autoencoder

  • Article type: Journal Article
    The recent acceleration in genome sequencing targeting previously unexplored parts of the tree of life presents computational challenges. Samples collected from the wild often contain sequences from several organisms, including the target, its cobionts, and contaminants. Effective methods are therefore needed to separate sequences. Though advances in sequencing technology make this task easier, it remains difficult to taxonomically assign sequences from eukaryotic taxa that are not well-represented in databases. Therefore, reference-based methods alone are insufficient. Here, I examine how we can take advantage of differences in sequence composition between organisms to identify symbionts, parasites and contaminants in samples, with minimal reliance on reference data. To this end, I explore data from the Darwin Tree of Life project, including hundreds of high-quality HiFi read sets from insects. Visualising two-dimensional representations of read tetranucleotide composition learned by a Variational Autoencoder can reveal distinct components of a sample. Annotating the embeddings with additional information, such as coding density, estimated coverage, or taxonomic labels allows rapid assessment of the contents of a dataset. The approach scales to millions of sequences, making it possible to explore unassembled read sets, even for large genomes. Combined with interactive visualisation tools, it allows a large fraction of cobionts reported by reference-based screening to be identified. Crucially, it also facilitates retrieving genomes for which suitable reference data are absent.
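The tetranucleotide composition that the VAE takes as input can be illustrated with a minimal sketch. It assumes a plain 256-dimensional frequency vector over A, C, G and T; the abstract does not say whether reverse complements are merged or how ambiguous bases are handled, so the function name, the feature ordering and the skipping of windows containing other symbols are illustrative choices, not the author's pipeline.

```python
from itertools import product

# All 256 possible tetranucleotides in a fixed order (assumed feature layout).
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def tetranucleotide_composition(read: str) -> list[float]:
    """Return the normalised 4-mer frequency vector of a read."""
    counts = [0] * len(KMERS)
    seq = read.upper()
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:  # skip windows containing N or other symbols
            counts[idx] += 1
    total = sum(counts)
    # A read with no valid window yields an all-zero vector.
    return [c / total for c in counts] if total else [0.0] * len(KMERS)


vec = tetranucleotide_composition("ACGTACGTAC")
print(len(vec), vec[INDEX["ACGT"]])
```

Each read becomes one such vector, and the two-dimensional embedding described above would be learned from a matrix of these vectors.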

  • Article type: Journal Article
    BACKGROUND: Mental disorders have ranked among the top 10 prevalent causes of burden on a global scale. Generative artificial intelligence (GAI) has emerged as a promising and innovative technological advancement that has significant potential in the field of mental health care. Nevertheless, there is a scarcity of research dedicated to examining and understanding the application landscape of GAI within this domain.
    OBJECTIVE: This review aims to inform the current state of GAI knowledge and identify its key uses in the mental health domain by consolidating relevant literature.
    METHODS: Records were searched within 8 reputable sources including Web of Science, PubMed, IEEE Xplore, medRxiv, bioRxiv, Google Scholar, CNKI and Wanfang databases between 2013 and 2023. Our focus was on original, empirical research with either English or Chinese publications that use GAI technologies to benefit mental health. For an exhaustive search, we also checked the studies cited by relevant literature. Two reviewers were responsible for the data selection process, and all the extracted data were synthesized and summarized for brief and in-depth analyses depending on the GAI approaches used (traditional retrieval and rule-based techniques vs advanced GAI techniques).
    RESULTS: In this review of 144 articles, 44 (30.6%) met the inclusion criteria for detailed analysis. Six key uses of advanced GAI emerged: mental disorder detection, counseling support, therapeutic application, clinical training, clinical decision-making support, and goal-driven optimization. Advanced GAI systems have been mainly focused on therapeutic applications (n=19, 43%) and counseling support (n=13, 30%), with clinical training being the least common. Most studies (n=28, 64%) focused broadly on mental health, while specific conditions such as anxiety (n=1, 2%), bipolar disorder (n=2, 5%), eating disorders (n=1, 2%), posttraumatic stress disorder (n=2, 5%), and schizophrenia (n=1, 2%) received limited attention. Despite prevalent use, the efficacy of ChatGPT in the detection of mental disorders remains insufficient. In addition, 100 articles on traditional GAI approaches were found, indicating diverse areas where advanced GAI could enhance mental health care.
    CONCLUSIONS: This study provides a comprehensive overview of the use of GAI in mental health care, which serves as a valuable guide for future research, practical applications, and policy development in this domain. While GAI demonstrates promise in augmenting mental health care services, its inherent limitations emphasize its role as a supplementary tool rather than a replacement for trained mental health providers. A conscientious and ethical integration of GAI techniques is necessary, ensuring a balanced approach that maximizes benefits while mitigating potential challenges in mental health care practices.

  • Article type: Journal Article
    Pneumonia ranks among the most prevalent lung diseases and poses a significant concern since it is one of the diseases that may lead to death around the world. Diagnosing pneumonia necessitates a chest X-ray and substantial expertise to ensure accurate assessments. Despite the critical role of lateral X-rays in providing additional diagnostic information alongside frontal X-rays, they have not been widely used. Obtaining X-rays from multiple perspectives is crucial, significantly improving the precision of disease diagnosis. In this paper, we propose a multi-view multi-feature fusion model (MV-MFF) that integrates latent representations from a variational autoencoder and a β-variational autoencoder. Our model aims to classify pneumonia presence using multi-view X-rays. Experimental results demonstrate that the MV-MFF model achieves an accuracy of 80.4% and an area under the curve of 0.775, outperforming current state-of-the-art methods. These findings underscore the efficacy of our approach in improving pneumonia diagnosis through multi-view X-ray analysis.
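The difference between a variational autoencoder and a β-variational autoencoder lies in the weight on the KL term of the objective. The sketch below shows that term for a diagonal Gaussian posterior and a standard normal prior, which is the usual setup but is an assumption here. The `fuse` helper is hypothetical: it concatenates two latent vectors as one plausible reading of "integrates latent representations", since the abstract does not specify how MV-MFF fuses them.

```python
import math


def kl_diag_gaussian(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def beta_vae_loss(recon_error, mu, log_var, beta=1.0):
    """Negative ELBO with the KL term weighted by beta (beta=1 is a plain VAE)."""
    return recon_error + beta * kl_diag_gaussian(mu, log_var)


def fuse(z_vae, z_beta_vae):
    """Concatenate the two latent vectors (hypothetical fusion step)."""
    return list(z_vae) + list(z_beta_vae)


print(beta_vae_loss(2.0, [1.0], [0.0], beta=4.0))
```

A larger β pushes the posterior harder towards the prior, which is why the two encoders can yield different latent features for the same image.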

  • Article type: Journal Article
    An automated quality control (QC) system is essential to ensure streamlined head computed tomography (CT) scan interpretations that do not affect subsequent image analysis. Such a system is advantageous compared to current human QC protocols, which are subjective and time-consuming. In this work, we aim to develop a deep learning-based framework to classify a scan as being of usable or unusable quality. Supervised deep learning models have been highly effective in classification tasks, but they are highly complex and require large, annotated datasets for effective training. Additional challenges with QC datasets include (1) class imbalance, where usable cases far exceed unusable ones, and (2) weak labels, where scan-level labels may not match slice-level labels. The proposed framework utilizes these weak labels to augment a standard anomaly detection technique. Specifically, we propose a hybrid model that consists of a variational autoencoder (VAE) and a Siamese Neural Network (SNN). While the VAE is trained to learn how usable scans appear and to reconstruct an input scan, the SNN compares how similar this input scan is to its reconstruction and flags the scans whose similarity falls below a threshold. The proposed method is better suited to capturing the differences in non-linear feature structure between the two classes of data than typical anomaly detection methods that depend on intensity-based metrics such as root mean square error (RMSE). Comparison with state-of-the-art anomaly detection methods using multiple classification metrics establishes the superiority of the proposed framework in flagging inferior-quality scans for review by radiologists, thus reducing their workload and establishing a reliable and consistent dataflow.
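For contrast with the SNN-based comparison, the intensity-based baseline mentioned in the abstract can be sketched as follows. This is only the RMSE thresholding idea under simple assumptions: scans are flattened lists of intensities, the reconstructions come from some already-trained autoencoder, and the function names and threshold are hypothetical. It is not the proposed hybrid model.

```python
import math


def rmse(scan, reconstruction):
    """Root mean square error between two equally sized intensity lists."""
    if len(scan) != len(reconstruction):
        raise ValueError("scan and reconstruction must have the same length")
    squared = [(a - b) ** 2 for a, b in zip(scan, reconstruction)]
    return math.sqrt(sum(squared) / len(squared))


def flag_unusable(scans, reconstructions, threshold):
    """Return indices of scans whose reconstruction error exceeds the threshold."""
    return [
        i
        for i, (s, r) in enumerate(zip(scans, reconstructions))
        if rmse(s, r) > threshold
    ]


scans = [[0.0, 0.0], [1.0, 1.0]]
recons = [[0.0, 0.0], [4.0, 5.0]]
print(flag_unusable(scans, recons, threshold=1.0))
```

Because RMSE compares intensities voxel by voxel, it is insensitive to structural differences that leave average intensity error small, which is the limitation the learned similarity measure is meant to address.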

  • Article type: Journal Article
    BACKGROUND: Missing data is a common challenge in mass spectrometry-based metabolomics, which can lead to biased and incomplete analyses. The integration of whole-genome sequencing (WGS) data with metabolomics data has emerged as a promising approach to enhance the accuracy of data imputation in metabolomics studies.
    METHODS: In this study, we propose a novel method that leverages the information from WGS data and reference metabolites to impute unknown metabolites. Our approach utilizes a multi-scale variational autoencoder to jointly model the burden score, polygenic risk score (PGS), and linkage disequilibrium (LD) pruned single nucleotide polymorphisms (SNPs) for feature extraction and missing metabolomics data imputation. By learning the latent representations of both omics data, our method can effectively impute missing metabolomics values based on genomic information.
    RESULTS: We evaluate the performance of our method on empirical metabolomics datasets with missing values and demonstrate its superiority compared to conventional imputation techniques. Using burden scores, PGS and LD-pruned SNPs derived from 35 template metabolites, the proposed method achieved R² scores > 0.01 for 71.55% of metabolites.
    CONCLUSIONS: The integration of WGS data in metabolomics imputation not only improves data completeness but also enhances downstream analyses, paving the way for more comprehensive and accurate investigations of metabolic pathways and disease associations. Our findings offer valuable insights into the potential benefits of utilizing WGS data for metabolomics data imputation and underscore the importance of leveraging multi-modal data integration in precision medicine research.
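The reported result is a share of metabolites whose R² score exceeds a threshold. The abstract does not say how R² is computed, so the sketch below assumes the usual coefficient of determination; the function names and the example scores are illustrative only.

```python
def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


def fraction_above(scores, threshold=0.01):
    """Share of per-metabolite scores strictly greater than the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


# Made-up per-metabolite scores, used only to show the calculation.
print(fraction_above([0.5, 0.0, 0.02, -0.1]))
```

Under this definition R² can be negative when the imputed values fit worse than the mean, so a threshold just above zero separates metabolites with any predictive signal from those with none.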

  • Article type: Journal Article
    In practical engineering, obtaining labeled high-quality fault samples poses challenges. Conventional fault diagnosis methods based on deep learning struggle to discern the underlying causes of mechanical faults from a fine-grained perspective, due to the scarcity of annotated data. To tackle these issues, we propose a novel semi-supervised Gaussian Mixed Variational Autoencoder method, SeGMVAE, aimed at acquiring unsupervised representations that can be transferred across fine-grained fault diagnostic tasks, enabling the identification of previously unseen faults using only a small number of labeled samples. Initially, Gaussian mixtures are introduced as a multimodal prior distribution for the Variational Autoencoder. This distribution is dynamically optimized for each task through an expectation-maximization (EM) algorithm, constructing a latent representation of the bridging task and unlabeled samples. Subsequently, a set variational posterior approach is presented to encode each task sample into the latent space, facilitating meta-learning. Finally, semi-supervised EM integrates the posterior of labeled data by acquiring task-specific parameters for diagnosing unseen faults. Results from two experiments demonstrate that SeGMVAE excels in identifying new fine-grained faults and exhibits outstanding performance in cross-domain fault diagnosis across different machines. Our code is available at https://github.com/zhiqan/SeGMVAE.

  • Article type: Journal Article
    Radiography plays an important role in medical care, and accurate positioning is essential for providing optimal quality images. Radiographs with insufficient diagnostic value are rejected, and retakes are required. However, determining the suitability of retaking radiographs is a qualitative evaluation.
    The aim was to evaluate skull radiograph accuracy automatically using an unsupervised learning-based autoencoder (AE) and a variational autoencoder (VAE). In this study, we eliminated visual qualitative evaluation and used unsupervised learning to identify skull radiography retakes from a quantitative evaluation.
    Five skull phantoms were imaged on radiographs, and 1,680 images were acquired. These images correspond to two categories: normal images captured at appropriate positions and images captured at inappropriate positions. This study verified the discriminatory ability of skull radiographs using anomaly detection methods.
    The areas under the curves for AE and VAE were 0.7060 and 0.6707, respectively, in receiver operating characteristic analysis. Our proposed method showed a higher discrimination ability than previous studies, which had an accuracy of 52%.
    Our findings suggest that the proposed method has high classification accuracy in determining the suitability of retaking skull radiographs. Automating the decision on whether to retake a radiograph contributes to improving operational efficiency in busy X-ray imaging operations.
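The area under the ROC curve reported above can be computed from anomaly scores without plotting the curve, using its rank interpretation: the probability that a randomly chosen positive scores higher than a randomly chosen negative. The sketch below is a generic illustration with made-up labels and scores (1 = inappropriate position, an assumed coding); it is not the study's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up reconstruction-error scores for four images.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to chance-level ranking, which gives a reference point for reading values such as 0.7060 and 0.6707.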

  • Article type: Journal Article
    The remarkable human ability to predict others' intent during physical interactions develops at a very early age and is crucial for development. Intent prediction, defined as the simultaneous recognition and generation of human-human interactions, has many applications such as in assistive robotics, human-robot interaction, video and robotic surveillance, and autonomous driving. However, models for solving the problem are scarce. This paper proposes two attention-based agent models to predict the intent of interacting 3D skeletons by sampling them via a sequence of glimpses. The novelty of these agent models is that they are inherently multimodal, consisting of perceptual and proprioceptive pathways. The action (attention) is driven by the agent's generation error, and not by reinforcement. At each sampling instant, the agent completes the partially observed skeletal motion and infers the interaction class. It learns where and what to sample by minimizing the generation and classification errors. Extensive evaluation of our models is carried out on benchmark datasets and in comparison to a state-of-the-art model for intent prediction, which reveals that classification and generation accuracies of one of the proposed models are comparable to those of the state of the art even though our model contains fewer trainable parameters. The insights gained from our model designs can inform the development of efficient agents, the future of artificial intelligence (AI).

  • Article type: Journal Article
    Variational Autoencoders (VAEs) are an efficient variational inference technique coupled with a generative network. Due to the uncertainty provided by variational inference, VAEs have been applied in medical image registration. However, a critical problem in VAEs is that the simple prior cannot provide suitable regularization, which leads to a mismatch between the variational posterior and the prior. An optimal prior can close the gap between the evidence's real and variational posterior. In this paper, we propose a multi-stage VAE to learn the optimal prior, which is the aggregated posterior. A lightweight VAE is used to generate the aggregated posterior as a whole. It is an effective way to estimate the distribution of the high-dimensional aggregated posterior that commonly exists in medical image registration based on VAEs. A factorized telescoping classifier is trained to estimate the density ratio of a simple given prior and the aggregated posterior, aiming to calculate the KL divergence between the variational and aggregated posterior more accurately. We analyze the KL divergence and find that the finer the factorization, the smaller the KL divergence is. However, too fine a partition is not conducive to registration accuracy. Moreover, the diagonal hypothesis of the variational posterior's covariance ignores the relationship between latent variables in image registration. To address this issue, we learn a covariance matrix with low-rank information to enable correlations between the dimensions of the variational posterior. The covariance matrix is further used as a measure to reduce the uncertainty of deformation fields. Experimental results on four public medical image datasets demonstrate that our proposed method outperforms other methods in negative log-likelihood (NLL) and achieves better registration accuracy.
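For orientation, the quantities named in this abstract can be written in standard VAE notation. The symbols below (encoder q_φ, decoder p_θ, prior p(z), N training images x_n, density ratio r) are the conventional ones and are assumptions here; the abstract itself gives no formulas.

```latex
\begin{align}
% Evidence lower bound maximised by a VAE
\log p_\theta(x) &\ge \mathbb{E}_{q_\phi(z\mid x)}\big[\log p_\theta(x\mid z)\big]
  - \mathrm{KL}\big(q_\phi(z\mid x)\,\|\,p(z)\big), \\
% Aggregated posterior: the encoder distribution averaged over the data
q_\phi(z) &= \frac{1}{N}\sum_{n=1}^{N} q_\phi(z\mid x_n), \\
% KL term expressed through the density ratio r(z) = p(z) / q_\phi(z)
\mathrm{KL}\big(q_\phi(z)\,\|\,p(z)\big)
  &= \mathbb{E}_{q_\phi(z)}\!\left[\log\frac{q_\phi(z)}{p(z)}\right]
  = -\,\mathbb{E}_{q_\phi(z)}\big[\log r(z)\big].
\end{align}
```

The last line shows why a classifier that estimates the ratio between the simple prior and the aggregated posterior is enough to estimate a KL term without evaluating either density directly.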

  • Article type: Journal Article
    The identification of microbe-drug associations can greatly facilitate drug research and development. Traditional methods for screening microbe-drug associations are time-consuming, manpower-intensive, and costly to conduct, so computational methods are a good alternative. However, most of them ignore the combination of abundant sequence, structural information, and microbe-drug network topology.
    In this study, we developed a computational framework based on a modified graph attention variational autoencoder (MGAVAEMDA) to infer potential microbe-drug associations by combining biological information with the variational autoencoder. In MGAVAEMDA, we first used multiple databases, which include microbial sequences, drug structures, and microbe-drug association databases, to establish two comprehensive feature matrices of microbes and drugs after multiple similarity computations, fusion, smoothing, and thresholding. Then, we employed a combination of variational autoencoder and graph attention to extract low-dimensional feature representations of microbes and drugs. Finally, the low-dimensional feature representation and graph adjacency matrix were input into the random forest classifier to obtain the microbe-drug association score and identify potential microbe-drug associations. Moreover, to reduce model complexity and redundant calculation and thereby improve efficiency, we introduced a modified graph convolutional neural network embedded into the variational autoencoder for computing low-dimensional features.
    The experimental results demonstrate that the prediction performance of MGAVAEMDA is better than that of five state-of-the-art methods. For the major measurements (AUC = 0.9357, AUPR = 0.9378), the relative improvements of MGAVAEMDA compared to the second-best methods are 1.76% and 1.47%, respectively.
    We conducted case studies on two drugs and found that more than 85% of the predicted associations have been reported in PubMed. The comprehensive experimental results validated the reliability of our model in accurately inferring potential microbe-drug associations.