Keywords: Cox model; cancer; microRNA; random survival forest model; sequencing depth; survival

MeSH: Humans; Proportional Hazards Models; Random Forest; Gene Expression Profiling; MicroRNAs / genetics; Lung Neoplasms / genetics; RNA, Messenger / genetics

Source: DOI:10.3390/genes13122275

Abstract:
(1) Background: tumor profiling enables patient survival prediction. The two essential parameters to be calibrated when designing a study based on tumor profiles from a cohort are the sequencing depth of RNA-seq technology and the number of patients. This calibration is carried out under cost constraints, and a compromise has to be found. In the context of survival data, the goal of this work is to benchmark the impact of the number of patients and of the sequencing depth of miRNA-seq and mRNA-seq on the predictive capabilities for both the Cox model with elastic net penalty and random survival forest. (2) Results: we first show that the Cox model and random survival forest provide comparable prediction capabilities, with significant differences for some cancers. Second, we demonstrate that miRNA and/or mRNA data improve prediction over clinical data alone. mRNA-seq data leads to slightly better prediction than miRNA-seq, with the notable exception of lung adenocarcinoma for which the tumor miRNA profile shows higher predictive power. Third, we demonstrate that the sequencing depth of RNA-seq data can be reduced for most of the investigated cancers without degrading the prediction abilities, allowing the creation of independent validation sets at a lower cost. Finally, we show that the number of patients in the training dataset can be reduced for the Cox model and random survival forest, allowing the use of different models on different patient subgroups.