Head and neck squamous cell carcinoma (HNSCC) is the seventh most prevalent cancer type worldwide. Early detection of HNSCC is one of the important challenges in managing the treatment of cancer patients. Existing techniques for detecting HNSCC are costly and invasive.
In this study, we aimed to address this issue by developing classification models based on machine learning and deep learning techniques, using single-cell transcriptomics data to distinguish HNSCC from normal samples. We also built models to classify HNSCC samples as HPV-positive (HPV+) or HPV-negative (HPV-). From the GSE181919 dataset, we extracted 20 primary cancer (HNSCC) samples and 9 normal tissue samples. The primary cancer samples comprised 13 HPV- and 7 HPV+ samples. The models were trained on 80% of the dataset and validated on the remaining 20%. To develop an efficient model, we performed feature selection with the mRMR (minimum redundancy maximum relevance) method to shortlist a small number of genes from the full gene set. We also performed Gene Ontology (GO) enrichment analysis on the 100 shortlisted genes.
An artificial neural network (ANN) model trained on 100 genes outperformed the other classifiers, achieving an AUROC of 0.91 for HNSCC classification on the validation set. The same algorithm achieved an AUROC of 0.83 for classifying HPV+ and HPV- patients on the validation set. GO enrichment analysis showed that most genes were involved in binding and catalytic activities.
A software package has been developed in Python that allows users to identify HNSCC in patients along with their HPV status. It is available at https://webs.iiitd.edu.in/raghava/hnscpred/.
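
For illustration only, the workflow summarized above (80/20 split, mRMR feature selection, ANN classifier, AUROC on the validation set) can be sketched in Python with scikit-learn. This is a minimal sketch under stated assumptions, not the pipeline used in this study and not the hnscpred package: the data are synthetic, the helper `mrmr_select` is hypothetical, and mRMR is approximated by a greedy difference criterion that uses mutual information for relevance and absolute Pearson correlation as a stand-in for redundancy. The network size, the number of selected features (20 of 200), and the choice to select features on the training split only are likewise assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


def mrmr_select(X, y, k, random_state=0):
    """Hypothetical helper: greedy mRMR-style selection of k feature indices.

    Score = relevance (mutual information with the label) minus the mean
    redundancy (absolute Pearson correlation) with already selected features.
    """
    relevance = mutual_info_classif(X, y, random_state=random_state)
    # Constant features would yield NaN correlations; treat them as 0.
    corr = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    selected = [int(np.argmax(relevance))]
    remaining = set(range(X.shape[1])) - set(selected)
    while len(selected) < k and remaining:
        cand = np.array(sorted(remaining))
        redundancy = corr[np.ix_(cand, selected)].mean(axis=1)
        best = int(cand[np.argmax(relevance[cand] - redundancy)])
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic stand-in for an expression matrix (rows: samples, columns: genes).
X, y = make_classification(
    n_samples=300, n_features=200, n_informative=15, random_state=0
)

# 80% training / 20% validation, stratified by class label.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Select features on the training split only, to avoid information leakage.
selected = mrmr_select(X_train, y_train, k=20)

# Standardize using training statistics, then fit a small feed-forward network.
scaler = StandardScaler().fit(X_train[:, selected])
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(scaler.transform(X_train[:, selected]), y_train)

# AUROC on the held-out validation split.
val_scores = clf.predict_proba(scaler.transform(X_val[:, selected]))[:, 1]
auc = roc_auc_score(y_val, val_scores)
print(f"Validation AUROC on synthetic data: {auc:.2f}")
```

The printed AUROC reflects only the synthetic data and has no bearing on the values reported above. With real single-cell data, the input matrix and labels would replace the `make_classification` call, and the same split would be reused for the HPV+ versus HPV- task.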