Keywords: deep learning algorithm; paclitaxel; single nucleotide polymorphisms

Source: DOI:10.1002/cai2.110

Abstract:
The rate at which the anticancer drug paclitaxel is cleared from the body strongly affects its dosing and the effectiveness of chemotherapy. Paclitaxel clearance varies among individuals, primarily because of genetic polymorphisms. This metabolic variability arises from a nonlinear process that is influenced by multiple single nucleotide polymorphisms (SNPs). Conventional bioinformatics methods struggle to analyze this complex process accurately, and there is currently no established, efficient algorithm for investigating SNP interactions.
We developed a novel machine-learning approach, the GEP-CSIs data mining algorithm. This algorithm, an advanced version of GEP, uses linear algebra computations to handle discrete variables. The GEP-CSIs algorithm calculates a fitness function score based on paclitaxel clearance data and genetic polymorphisms in patients with non-small cell lung cancer. The data were divided into a primary set and a validation set for the analysis.
We identified and validated the 1184 three-SNP combinations with the highest fitness function values. Notably, SERPINA1, ATF3 and EGF were found to influence paclitaxel clearance indirectly by coordinating the activity of genes previously reported to be significant in paclitaxel clearance. Particularly intriguing was the discovery of a combination of three SNPs in the genes FLT1, EGF and MUC16. The proteins related to these SNPs were confirmed to interact with one another in the protein-protein interaction network, providing a basis for further exploration of their functional roles and mechanisms.
We developed an effective deep-learning algorithm for mining SNP interactions from data on paclitaxel clearance and individual genetic polymorphisms.
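The abstract does not spell out the fitness function, so the following is only a minimal sketch of the general idea of scoring three-SNP combinations on a primary set and re-checking the best one on a validation set. It is not the GEP-CSIs algorithm. The score used here (the share of clearance variance explained by the joint genotype), the function names `fitness` and `top_triples`, the 0/1 genotype coding, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations, product
from statistics import fmean


def fitness(genotypes, clearance, combo):
    """Share of clearance variance explained by the joint genotype at `combo`.

    Hypothetical stand-in score, not the published fitness function.
    """
    groups = defaultdict(list)
    for row, value in zip(genotypes, clearance):
        # Patients sharing the same genotype at all SNPs in `combo` form one group.
        groups[tuple(row[i] for i in combo)].append(value)
    grand_mean = fmean(clearance)
    total = sum((v - grand_mean) ** 2 for v in clearance)
    if total == 0:
        return 0.0
    between = sum(
        len(vals) * (fmean(vals) - grand_mean) ** 2 for vals in groups.values()
    )
    return between / total


def top_triples(genotypes, clearance, k=5):
    """Score every three-SNP combination and return the k best as (score, combo)."""
    n_snps = len(genotypes[0])
    scored = [
        (fitness(genotypes, clearance, combo), combo)
        for combo in combinations(range(n_snps), 3)
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:k]


# Toy data (assumed): 5 binary SNPs, clearance driven purely by an
# interaction among SNPs 0, 1 and 2, which no single SNP reveals on its own.
patients = list(product((0, 1), repeat=5))
clearance = [10.0 + 5.0 * (g[0] ^ g[1] ^ g[2]) for g in patients]

# Alternate patients between a primary set and a validation set.
primary_g, primary_c = patients[::2], clearance[::2]
valid_g, valid_c = patients[1::2], clearance[1::2]

best_score, best_combo = top_triples(primary_g, primary_c, k=1)[0]
validation_score = fitness(valid_g, valid_c, best_combo)
print(best_combo, best_score, validation_score)
```

In this toy case each SNP is uninformative on its own, and only the joint genotype of the three interacting SNPs explains the clearance values. That is the kind of nonlinear effect an interaction-mining method has to detect. Exhaustive enumeration is used here only because the example is tiny.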