Keywords: diagnosis; enrichment analysis; lung cancer; machine learning; radiomics; radiotranscriptomics; statistical analysis; transcriptomics

Source: DOI:10.3390/diagnostics13040738

Abstract:
Radiotranscriptomics is an emerging field that aims to investigate the relationships between radiomic features extracted from medical images and gene expression profiles that contribute to the diagnosis, treatment planning, and prognosis of cancer. This study proposes a methodological framework for investigating these associations, with application to non-small-cell lung cancer (NSCLC). Six publicly available NSCLC datasets with transcriptomics data were used to derive a transcriptomic signature and validate its ability to differentiate between cancer and non-malignant lung tissue. A publicly available dataset of 24 NSCLC-diagnosed patients, with both transcriptomic and imaging data, was used for the joint radiotranscriptomic analysis. For each patient, 749 Computed Tomography (CT) radiomic features were extracted, and the corresponding transcriptomics data were provided through DNA microarrays. The radiomic features were clustered using the iterative K-means algorithm, resulting in 77 homogeneous clusters, each represented by a meta-radiomic feature. The most significant differentially expressed genes (DEGs) were selected by combining Significance Analysis of Microarrays (SAM) with a 2-fold change criterion. The interactions among the CT imaging features and the selected DEGs were investigated using SAM and a Spearman rank correlation test with a False Discovery Rate (FDR) of 5%, leading to the extraction of 73 DEGs significantly correlated with radiomic features. These genes were used to produce predictive models of the meta-radiomic features, defined as p-metaomics features, by performing Lasso regression. Of the 77 meta-radiomic features, 51 can be modeled in terms of the transcriptomic signature. These significant radiotranscriptomics relationships form a reliable basis to biologically justify the radiomics features extracted from anatomic imaging modalities. Thus, the biological value of these radiomic features was justified via enrichment analysis on their transcriptomics-based regression models, revealing closely associated biological processes and pathways. Overall, the proposed methodological framework provides joint radiotranscriptomics markers and models to support the connection and complementarities between the transcriptome and the phenotype in cancer, as demonstrated in the case of NSCLC.
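
The correlation step described in the abstract (Spearman rank correlation between meta-radiomic features and DEGs, filtered at a 5% FDR) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract states that SAM was used alongside the Spearman test, whereas the sketch below swaps in permutation p-values and the Benjamini-Hochberg procedure as an assumed stand-in for FDR control. All function names, gene names (`gene_A`, `gene_B`, `gene_C`), and the synthetic data are hypothetical; only the cohort size of 24 is taken from the abstract.

```python
import random


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_pvalue(x, y, n_perm=999, rng=None):
    """Two-sided permutation p-value for Spearman rho (assumed stand-in for SAM)."""
    rng = rng or random.Random(0)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a keep/reject mask controlling the FDR at the given level."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= fdr * rank / m:
            cutoff = rank  # largest rank that passes its threshold
    keep = [False] * m
    for idx in order[:cutoff]:
        keep[idx] = True
    return keep


# Synthetic toy data: one meta-radiomic feature and three genes for 24 patients.
rng = random.Random(0)
meta_feature = [rng.gauss(0, 1) for _ in range(24)]
genes = {
    "gene_A": [2 * f + rng.gauss(0, 0.3) for f in meta_feature],  # related by design
    "gene_B": [rng.gauss(0, 1) for _ in range(24)],               # unrelated noise
    "gene_C": [rng.gauss(0, 1) for _ in range(24)],               # unrelated noise
}
names = list(genes)
pvals = [permutation_pvalue(meta_feature, genes[g], rng=rng) for g in names]
selected = [g for g, keep in zip(names, benjamini_hochberg(pvals)) if keep]
print("Genes passing the 5% FDR filter:", selected)
```

In a real analysis the same filter would run over every pair of meta-radiomic feature and DEG, and the surviving genes would then serve as predictors in the Lasso regression step mentioned in the abstract.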