Keywords: Gleason grading; WGCNA-based reconstruction; cancer aggressiveness; cancer differentiation; consensus biomarker; grade-salient gene; machine learning; prostate adenocarcinoma; risk stratification; trait-specific key gene

MeSH: Male; Humans; Prostate / pathology; Consensus; Prostatic Neoplasms / pathology; Biomarkers; Adenocarcinoma / genetics, pathology; Neoplasm Grading; Muscle Proteins; Intracellular Signaling Peptides and Proteins; LIM Domain Proteins; Shc Signaling Adaptor Proteins

Source: DOI:10.1177/15330338231222389

Abstract:
Prostate adenocarcinoma (PRAD) is a common cancer diagnosis among men globally, yet large gaps persist in our knowledge of the molecular bases of its progression and aggressiveness. Most cases are indolent and slow-growing, but aggressive prostate cancers need to be recognized early to optimise treatment and reduce mortality.
Using TCGA transcriptomic data for PRAD and the associated clinical metadata, we determined the Gleason grade of each sample and used it to perform: (i) Gleason grade-wise linear modeling, followed by five contrasts against controls and ten contrasts between grades; and (ii) Gleason grade-wise network modeling via weighted gene correlation network analysis (WGCNA). Candidate biomarkers were obtained from each analysis, and the consensus between the two sets was identified. The consensus biomarkers were used as the feature space to train ML models for classifying a sample as benign, indolent or aggressive.
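The contrast counts and the consensus step described above can be illustrated with a minimal sketch. It assumes five grade groups plus one control group, which is what five contrasts against controls and ten pairwise contrasts imply. The group labels, the function names and the placeholder gene identifiers are hypothetical and are not taken from the authors' code or data.

```python
from itertools import combinations

# Hypothetical labels for the five Gleason grade groups and the control group.
GRADES = ["G1", "G2", "G3", "G4", "G5"]
CONTROL = "Control"


def build_contrasts(grades, control):
    """Return (grade-vs-control contrasts, grade-vs-grade contrasts) as strings."""
    vs_control = [f"{g}-{control}" for g in grades]
    # Every unordered pair of grades gives one between-grade contrast.
    between = [f"{hi}-{lo}" for lo, hi in combinations(grades, 2)]
    return vs_control, between


def consensus(linear_genes, wgcna_genes):
    """Genes reported by both approaches (plain set intersection)."""
    return sorted(set(linear_genes) & set(wgcna_genes))


vs_control, between_grades = build_contrasts(GRADES, CONTROL)
print(len(vs_control), len(between_grades))  # 5 10

# Placeholder identifiers only; these are not results from the study.
print(consensus(["GENE_A", "GENE_B"], ["GENE_B", "GENE_C"]))  # ['GENE_B']
```

A set intersection is the simplest reading of "consensus"; the study may apply additional per-grade filters that the abstract does not describe.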
The statistical modeling yielded 77 Gleason grade-salient genes, while the WGCNA algorithm yielded 1003 trait-specific key genes in grade-wise significant modules. Consensus analysis of the two approaches identified two genes in Grade-1 (SLC43A1 and PHGR1), 26 genes in Grade-4 (including LOC100128675, PPP1R3C, NECAB1, UBXN10, SERPINA5, CLU, RASL12, DGKG, FHL1, NCAM1, and CEND1), and seven genes in Grade-5 (CBX2, DPYS, FAM72B, SHCBP1, TMEM132A, TPX2, UBE2C). A RandomForest model trained and optimized on these 35 biomarkers for the ternary classification problem yielded a balanced accuracy of approximately 86% on external validation.
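Balanced accuracy, the metric reported above, is the mean of the per-class recalls, so a large class cannot dominate the score. The sketch below computes it for the three classes named in the abstract. The labels and predictions are toy values invented for illustration and are not the study's validation data; the function name is likewise an assumption.

```python
from collections import defaultdict

CLASSES = ("benign", "indolent", "aggressive")


def balanced_accuracy(y_true, y_pred, classes=CLASSES):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in classes if total[c] > 0]
    return sum(recalls) / len(recalls)


# Toy, deliberately imbalanced example (not data from the paper).
y_true = ["benign"] * 2 + ["indolent"] * 4 + ["aggressive"] * 4
y_pred = (
    ["benign", "benign"]
    + ["indolent", "indolent", "indolent", "aggressive"]
    + ["aggressive", "aggressive", "aggressive", "indolent"]
)

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                               # 0.8
print(round(balanced_accuracy(y_true, y_pred), 3))  # 0.833
```

Here the per-class recalls are 1.0, 0.75 and 0.75, so the balanced accuracy (0.833) differs from the plain accuracy (0.8) because the smaller benign class carries the same weight as the other two.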
The consensus of multiple parallel computational strategies has unmasked candidate Gleason grade-specific biomarkers. PRADclass, a validated AI model featurizing these biomarkers, achieved good performance and could be trialed to predict the differentiation of prostate cancers. PRADclass is available for academic use at: https://apalania.shinyapps.io/pradclass (online) and https://github.com/apalania/pradclass (command-line interface).