feature embedding

  • Article type: Journal Article
    Humans have the ability to constantly learn new knowledge. However, for artificial intelligence, trying to continuously learn new knowledge usually results in catastrophic forgetting, which existing regularization-based and dynamic structure-based approaches have shown great potential to alleviate. Nevertheless, these approaches have certain limitations. They usually do not fully consider the problem of incompatible feature embeddings. Instead, they tend to focus only on the features of new or previous classes and fail to comprehensively consider the entire model. Therefore, we propose a two-stage learning paradigm to solve the feature embedding incompatibility problem. Specifically, in the first stage we retain the previous model and freeze all its parameters while dynamically expanding a new module to alleviate the feature embedding incompatibility. In the second stage, a fusion knowledge distillation approach is used to compress the redundant feature dimensions. Moreover, we propose weight pruning and consolidation approaches to improve the efficiency of the model. Our experimental results obtained on the CIFAR-100, ImageNet-100 and ImageNet-1000 benchmark datasets show that the proposed approaches achieve the best performance among all the compared approaches. For example, on the ImageNet-100 dataset, the maximal accuracy improvement is 5.08%. Code is available at https://github.com/ybyangjing/CIL-FCE.

  • Article type: Journal Article
    This research explores the potential of tocopherol-based nanoemulsion as a therapeutic agent for cardiovascular diseases (CVD) through an in-depth molecular docking analysis. The study focuses on elucidating the molecular interactions between tocopherol and seven key proteins (1O8a, 4YAY, 4DLI, 1HW9, 2YCW, 1BO9 and 1CX2) that play pivotal roles in CVD development. Through rigorous in silico docking investigations, the binding affinities, inhibitory potentials and interaction patterns of tocopherol with these target proteins were assessed. The findings revealed significant interactions, particularly with 4YAY, which displayed a robust binding energy of -6.39 kcal/mol and a promising Ki value of 20.84 μM. Notable interactions were also observed with 1HW9, 4DLI, 2YCW and 1CX2, further indicating tocopherol's potential therapeutic relevance. In contrast, no interaction was observed with 1BO9. Furthermore, the common residues of 4YAY bound to tocopherol were examined, highlighting key intermolecular hydrophobic bonds that contribute to the interaction's stability. Tocopherol complies with the pharmacokinetic rules (Lipinski's and Veber's) for oral bioavailability and proves safe, non-toxic and non-carcinogenic. Deep learning-based protein language models ESM1-b and ProtT5 were then leveraged for input encodings to predict interaction sites between the 4YAY protein and tocopherol, and highly accurate predictions of these critical protein-ligand interactions were achieved. This study not only advances the understanding of these interactions but also highlights deep learning's immense potential in molecular biology and drug discovery. It underscores tocopherol's promise as a cardiovascular disease management candidate, shedding light on its molecular interactions and compatibility with biomolecule-like characteristics.

  • Article type: Journal Article
    DNA N6-methyladenine (6mA) is one of the most common and abundant modifications and plays essential roles in various biological processes and cellular functions. Therefore, the accurate identification of DNA 6mA sites is of great importance for a better understanding of its regulatory mechanisms and biological functions. Although significant progress has been made, there is still room for further improvement in 6mA site prediction in DNA sequences. In this study, we report a smart but accurate 6mA predictor, termed SNN6mA, that uses a Siamese network. Specifically, DNA segments are first encoded into feature vectors using the one-hot encoding scheme; then, these original feature vectors are mapped to a low-dimensional embedding space derived from the Siamese network to capture more discriminative features; finally, the obtained low-dimensional features are fed to a fully connected neural network to perform the final prediction. Stringent benchmarking tests on the datasets of two species demonstrated that the proposed SNN6mA is superior to the state-of-the-art 6mA predictors. Detailed data analyses show that the major advantage of SNN6mA lies in its use of the Siamese network, which can map the original features into a low-dimensional embedding space with more discriminative capability. In summary, the proposed SNN6mA is the first attempt to use a Siamese network for 6mA site prediction and could be easily extended to predict other types of modifications. The code and datasets used in the study are freely available at https://github.com/YuXuan-Glasgow/SNN6mA for academic use.
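    The first step named in this abstract, one-hot encoding of a DNA segment, can be sketched as follows. This is only an illustration under stated assumptions, not the SNN6mA code: the function name `one_hot_encode`, the A/C/G/T column order, and the all-zero row for unknown bases are all hypothetical choices.

```python
# Assumed column order; the abstract does not specify one.
NUCLEOTIDES = "ACGT"


def one_hot_encode(segment):
    """Return one 4-element row per base, with a 1.0 in the column of that base."""
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    encoded = []
    for base in segment.upper():
        row = [0.0] * len(NUCLEOTIDES)
        if base in index:  # assumption: unknown symbols such as N stay all-zero
            row[index[base]] = 1.0
        encoded.append(row)
    return encoded


example = one_hot_encode("ACGTN")
```

    Each row of `example` is the feature vector of one position; a downstream embedding network would consume the resulting length-by-4 matrix.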

  • Article type: Journal Article
    OBJECTIVE: Deep learning (DL) models have achieved state-of-the-art medical diagnosis classification accuracy. Current models are limited by discrete diagnosis labels, but could yield more information with diagnosis on a continuous scale. We developed a novel continuous severity scaling system for macular telangiectasia (MacTel) type 2 by combining a DL classification model with uniform manifold approximation and projection (UMAP).
    METHODS: We used a DL network to learn a feature representation of MacTel severity from discrete severity labels and applied UMAP to embed this feature representation into 2 dimensions, thereby creating a continuous MacTel severity scale.
    METHODS: A total of 2003 OCT volumes were analyzed from 1089 MacTel Project participants.
    METHODS: We trained a multiview DL classifier using multiple B-scans from OCT volumes to learn a previously published discrete 7-step MacTel severity scale (Chew et al.). The classifier's last feature layer was extracted as input for UMAP, which embedded these features into a continuous 2-dimensional manifold. The DL classifier was assessed in terms of test accuracy. Rank correlation for the continuous UMAP scale against the previously published scale was calculated. Additionally, the UMAP scale was assessed by κ agreement against 5 clinical experts on 100 pairs of patient volumes. For each pair of patient volumes, clinical experts were asked to select the volume with more severe MacTel disease, and their selections were compared against the UMAP scale.
    METHODS: Classification accuracy for the DL classifier and κ agreement versus clinical experts for UMAP.
    RESULTS: The multiview DL classifier achieved a top-1 accuracy of 63.3% (186/294) on held-out test OCT volumes. The UMAP metric showed a clear continuous gradation of MacTel severity with a Spearman rank correlation of 0.84 with the previously published scale. Furthermore, the continuous UMAP metric achieved κ agreements of 0.56 to 0.63 with 5 clinical experts, which was comparable with interobserver κ values.
    CONCLUSIONS: Our UMAP embedding generated a continuous MacTel severity scale, without requiring continuous training labels. This technique can be applied to other diseases and may lead to more accurate diagnosis, improved understanding of disease progression, and key imaging features for pathologic characteristics.
    BACKGROUND: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
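    The κ agreement reported in this abstract compares, pair by pair, which volume the continuous scale ranks as more severe with which volume an expert selects. The sketch below computes Cohen's κ for two such lists of choices. It assumes Cohen's κ is the statistic meant (the abstract says only "κ agreement"), and the function name and toy data are hypothetical, not taken from the study.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Fraction of items on which the two raters made the same choice.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:  # both raters always pick the same single category
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data: 0 means "first volume is more severe", 1 means "second volume".
scale_choice = [0, 0, 1, 1, 0, 1, 0, 1]
expert_choice = [0, 0, 1, 1, 0, 1, 1, 0]
kappa = cohen_kappa(scale_choice, expert_choice)
```

    In this toy case the raters agree on 6 of 8 pairs (0.75) while chance agreement is 0.5, so κ comes out at 0.5.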

  • Article type: Journal Article
    Circular RNAs (circRNAs) are a class of noncoding RNA molecules featuring a closed circular structure. They have been shown to play a significant role in the reduction of many diseases. In addition, many studies on the clinical diagnosis and treatment of disease have revealed that circRNAs can be considered potential biomarkers. Therefore, understanding the associations between circRNAs and diseases can help to forecast some disorders of life activities. However, traditional biological experimental methods are time-consuming. The most common machine learning-based methods for circRNA-disease association prediction avoid this cost by relying on diverse data. Nevertheless, these methods usually do not use topological information about circRNAs and diseases. Moreover, circRNAs can be associated with diseases through miRNAs. With these considerations, we propose a novel method, named THGNCDA, to predict the associations between circRNAs and diseases. Specifically, for a given circRNA-disease pair, we employ a graph neural network with attention to learn the importance of each of its neighbors. In addition, we use a multilayer convolutional neural network to explore the relationship of a circRNA-disease pair based on their attributes. When calculating embeddings, we introduce information about miRNAs. The experimental results show that THGNCDA outperforms the SOTA methods and gives a better recall rate. To confirm the significance of attention, we conducted extensive ablation studies. Case studies on Urinary Bladder and Prostatic Neoplasms further show THGNCDA's ability to discover known relationships between circRNA candidates and diseases.

  • Article type: Journal Article
    Multi-label Zero-shot Learning (ZSL) is more reasonable and realistic than standard single-label ZSL because several objects can co-exist in a natural image in real scenarios. Intra-class feature entanglement is a significant factor influencing the alignment of visual and semantic features, resulting in the model's inability to recognize unseen samples comprehensively and completely. We observe that existing multi-label ZSL methods place a greater emphasis on attention-based refinement and decoupling of visual features, while ignoring the relationships between label semantics. Relying on label correlations to solve multi-label ZSL tasks has not been deeply studied. In this paper, we make full use of the co-occurrence relationships between category labels and build a directed weighted semantic graph based on statistics and prior knowledge, in which node features represent category semantics and weighted edges represent conditional probabilities of label co-occurrence. To guide the targeted extraction of visual features, node features and edge weights are simultaneously updated and refined, and embedded into the visual feature extraction network from both global and local perspectives. The proposed method's effectiveness was demonstrated by simulation results on two challenging multi-label ZSL benchmarks: NUS-WIDE and Open Images. In comparison to state-of-the-art models, our model achieves absolute gains of 2.4% mAP on NUS-WIDE and 2.1% mAP on Open Images.
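    The edge weights described in this abstract, conditional probabilities of label co-occurrence, can be estimated from multi-label annotations by counting. The sketch below covers only that statistical part; the prior knowledge and the learned refinement of node and edge features mentioned in the abstract are not modeled. The function name `cooccurrence_edges` and the toy annotations are hypothetical.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_edges(label_sets):
    """Map each directed edge (src, dst) to P(dst present | src present)."""
    label_count = Counter()
    pair_count = Counter()
    for labels in label_sets:
        unique = set(labels)
        label_count.update(unique)
        # Ordered pairs give both directions, so the graph is directed.
        pair_count.update(permutations(unique, 2))
    return {
        (src, dst): count / label_count[src]
        for (src, dst), count in pair_count.items()
    }


# Toy annotations: each set lists the labels attached to one image.
annotations = [
    {"sky", "cloud"},
    {"sky", "sea"},
    {"sky", "cloud", "sea"},
    {"cloud"},
]
edges = cooccurrence_edges(annotations)
```

    The weights are asymmetric: in the toy data every image labeled "sea" also contains "sky", so the edge from "sea" to "sky" has weight 1.0, while the edge from "sky" to "sea" has weight 2/3.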

  • Article type: Journal Article
    Doublets formed during single-cell RNA sequencing (scRNA-seq) severely affect downstream studies, such as differentially expressed gene analysis and cell trajectory inference, and limit the cellular throughput of scRNA-seq. Several doublet detection algorithms are currently available, but their generalization performance could be further improved because they lack effective feature-embedding strategies paired with suitable model architectures. Therefore, SoCube, a novel deep learning algorithm, was developed to precisely detect doublets in various types of scRNA-seq data. SoCube (i) proposed a novel 3D composite feature-embedding strategy that embedded latent gene information and (ii) constructed a multikernel, multichannel CNN-ensembled architecture in conjunction with the feature-embedding strategy. With its excellent performance on benchmark evaluation and several downstream tasks, it is expected to be a powerful algorithm to detect and remove doublets in scRNA-seq data. SoCube is freely provided as an end-to-end tool on the official Python package site PyPI (https://pypi.org/project/socube/) and is open source on GitHub (https://github.com/idrblab/socube/).

  • Article type: Journal Article
    BACKGROUND: Protein solubility is a precondition for efficient heterologous protein expression, which underlies most industrial applications, and for functional interpretation in basic research. However, recurrent formation of inclusion bodies is still an inevitable roadblock in protein science and industry, with only about a quarter of proteins successfully expressed in soluble form. Although numerous solubility prediction models have been developed over time, their performance remains unsatisfactory in the context of the current strong increase in available protein sequences. Hence, it is imperative to develop novel and highly accurate predictors that enable the prioritization of highly soluble proteins to reduce the cost of actual experimental work.
    RESULTS: In this study, we developed a novel tool, DeepSoluE, which predicts protein solubility using a long short-term memory (LSTM) network with hybrid features composed of physicochemical patterns and distributed representations of amino acids. Comparison results showed that the proposed model achieved more accurate and balanced performance than existing tools. Furthermore, we explored specific features that have a dominant impact on the model performance as well as their interaction effects.
    CONCLUSIONS: DeepSoluE is suitable for the prediction of protein solubility in E. coli; it serves as a bioinformatics tool for prescreening potentially soluble targets to reduce the cost of wet-experimental studies. The publicly available webserver is freely accessible at http://lab.malab.cn/~wangchao/softs/DeepSoluE/.

  • Article type: Journal Article
    Antimicrobial peptides (AMPs) are alkaline substances with efficient bactericidal activity produced in living organisms. As the best substitute for antibiotics, they have received increasing attention in scientific research and clinical application. AMPs can be produced by almost all organisms and are capable of killing a wide variety of pathogenic microorganisms. In addition to being antibacterial, natural AMPs have many other therapeutically important activities, such as wound healing, antioxidant and immunomodulatory effects. Discovering new AMPs with wet experimental methods is expensive and difficult, and bioinformatics technology can effectively address this problem. Recently, some deep learning methods have been applied to the prediction of AMPs and have achieved good results. To further improve the prediction accuracy for AMPs, this paper designs a new deep learning method based on multidimensional sequence representation. By encoding and embedding sequence features and then feeding them into the model to identify AMPs, high-precision classification of AMPs and non-AMPs with lengths of 10-200 is achieved. The results show that our method improved accuracy by 1.05% compared to the most advanced model in independent data validation without decreasing other indicators.

  • Article type: Journal Article
    BACKGROUND: Cancers are genetically heterogeneous, so anticancer drugs show varying degrees of effectiveness on patients due to their differing genetic profiles. Knowing patients' responses to numerous cancer drugs is necessary for personalized cancer treatment. By using molecular profiles of cancer cell lines available from the Cancer Cell Line Encyclopedia (CCLE) and anticancer drug responses available in the Genomics of Drug Sensitivity in Cancer (GDSC), we build computational models to predict anticancer drug responses from molecular features.
    RESULTS: We propose a novel deep neural network model that integrates multi-omics data available as gene expressions, copy number variations, gene mutations, reverse phase protein array expressions, and metabolomics expressions, in order to predict cellular responses to known anti-cancer drugs. We employ a novel graph embedding layer that incorporates interactome data as prior information for prediction. Moreover, we propose a novel attention layer that effectively combines different omics features, taking their interactions into account. The network outperformed feedforward neural networks and reported 0.90 for [Formula: see text] values for prediction of drug responses from cancer cell lines data available in CCLE and GDSC.
    CONCLUSIONS: The results of our experiments demonstrate that the proposed method is capable of capturing the interactions of genes and proteins and of integrating multi-omics features effectively. Furthermore, both the results of ablation studies and the investigations of the attention layer imply that gene mutation has a greater influence on the prediction of drug responses than other omics data types. Therefore, we conclude that our approach can not only predict the anti-cancer drug response precisely but also provide insights into the reaction mechanisms of cancer cell lines and drugs.