Keywords: Genomics; Machine learning; Precision medicine; Random forest; Variant classification

MeSH: Humans; Genetic Variation; Databases, Genetic; Time Factors; Reproducibility of Results; Genomics / methods

Source: DOI:10.1186/s12967-024-05508-w

Abstract:
BACKGROUND: Interpreting the clinical consequences of genetic variants is the central problem in modern clinical genomics, for both hereditary diseases and oncology. However, clinical validation lags behind the pace of discovery, leading to distressing uncertainty for patients, physicians and researchers. This "interpretation gap" changes over time as evidence accumulates, and variants initially classified as variants of uncertain significance (VUS) may subsequently be reclassified as pathogenic or benign. We previously developed RENOVO, a random forest-based tool that predicts variant pathogenicity from publicly available information in gnomAD and dbNSFP, and tested it on variants whose classification changed over time. Here, we comprehensively evaluated the accuracy of RENOVO predictions on variants that have been reclassified over the last four years.
METHODS: We retrieved 16 retrospective instances of the ClinVar database, one every 3 months from March 2020 to March 2024, and analyzed time trends in variant classification. We identified variants that changed status over time and compared RENOVO predictions generated in 2020 with the actual reclassifications.
RESULTS: VUS have become the most represented class in ClinVar (44.97%, vs. 9.75% (likely) pathogenic and 40.33% (likely) benign). VUS reclassification proceeds at a slow, linear rate, whereas VUS reporting grows exponentially and is currently ~30x faster, creating a growing divide between what can be sequenced and what can be interpreted. Of the 10,196 variants that were VUS in January 2020 and had undergone a clinically meaningful reclassification by March 2024, RENOVO correctly classified 82.6% in 2020. In addition, RENOVO correctly identified the majority of the few variants that switched between clinically meaningful classes (e.g., from benign to pathogenic and vice versa). We highlight variant classes and clinically relevant genes for which RENOVO provides particularly accurate estimates, in particular genes characterized by a large prevalence of high- or low-impact variants (e.g., POLE, NOTCH1, FANCM). Suboptimal RENOVO predictions mostly concern genes validated through dedicated consortia (e.g., BRCA1/2), for which RENOVO would in any case have limited impact.
CONCLUSIONS: Time trend analysis demonstrates that the current model of variant interpretation cannot keep up with variant discovery. Machine learning-based tools like RENOVO show high accuracy and can aid clinical practice and research.
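Illustrative sketch (not part of the published abstract): the comparison described under METHODS, finding variants that were VUS in an early ClinVar snapshot and had a clinically meaningful class in a later one, then scoring earlier predictions against the later class, might look roughly like the following. The function names, the dictionary-based snapshot representation, the two-class ("P"/"B") encoding of predictions, and all variant IDs and labels in the toy data are assumptions made for illustration; they are not the authors' pipeline or RENOVO's actual output format.

```python
# Hypothetical helper names and toy data; a sketch, not the authors' code.

# ClinVar-style clinical significance labels grouped into broad classes.
PATHOGENIC = {"Pathogenic", "Likely pathogenic", "Pathogenic/Likely pathogenic"}
BENIGN = {"Benign", "Likely benign", "Benign/Likely benign"}
VUS = {"Uncertain significance"}


def broad_class(label):
    """Map a significance label to 'P', 'B', 'VUS', or None if unrecognized."""
    if label in PATHOGENIC:
        return "P"
    if label in BENIGN:
        return "B"
    if label in VUS:
        return "VUS"
    return None


def reclassified_vus(old_snapshot, new_snapshot):
    """Return {variant_id: 'P' or 'B'} for variants that were VUS in the
    old snapshot and are (likely) pathogenic or benign in the new one."""
    result = {}
    for variant_id, old_label in old_snapshot.items():
        if broad_class(old_label) != "VUS":
            continue
        new_cls = broad_class(new_snapshot.get(variant_id))
        if new_cls in ("P", "B"):
            result[variant_id] = new_cls
    return result


def prediction_accuracy(predictions, truth):
    """Fraction of reclassified variants whose earlier prediction matches
    the later class; variants without a prediction are skipped."""
    scored = [v for v in truth if v in predictions]
    if not scored:
        return None
    correct = sum(predictions[v] == truth[v] for v in scored)
    return correct / len(scored)


# Toy snapshots (invented IDs and labels, for illustration only).
old = {
    "v1": "Uncertain significance",
    "v2": "Uncertain significance",
    "v3": "Benign",                  # not VUS at baseline, so ignored
    "v4": "Uncertain significance",  # still VUS later, so ignored
    "v5": "Uncertain significance",
}
new = {
    "v1": "Likely pathogenic",
    "v2": "Benign",
    "v3": "Pathogenic",
    "v4": "Uncertain significance",
    "v5": "Likely benign",
}
early_predictions = {"v1": "P", "v2": "B", "v5": "P"}

truth = reclassified_vus(old, new)
accuracy = prediction_accuracy(early_predictions, truth)
print(truth, accuracy)
```

In this toy case two of the three reclassified variants are predicted correctly. The figure reported in RESULTS (82.6% of 10,196 variants) comes from the study itself and is not reproduced by this sketch.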