Keywords: ACMG/AMP; cis-regulatory; genetics; genome; genomic variant; mutation; non-coding; promoter; sequence analysis; variant classification

MeSH: Humans; Computational Biology / methods; Regulatory Sequences, Nucleic Acid / genetics; Genetic Diseases, Inborn / genetics, classification; Genetic Variation; Calibration; Genetic Testing / methods

Source: DOI:10.1016/j.ajhg.2024.05.002

Abstract:
To date, clinical genetic testing for Mendelian disease variants has focused heavily on exonic coding and intronic gene regions. This multi-step study was undertaken to provide an evidence base for selecting and applying computational approaches for use in clinical classification of 5' cis-regulatory region variants. Curated datasets of clinically reported disease-causing 5' cis-regulatory region variants and variants from matched genomic regions in population controls were used to calibrate six bioinformatic tools as predictors of variant pathogenicity. Likelihood ratio estimates were aligned to code weights following ClinGen recommendations for application of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (ACMG/AMP) classification scheme. Considering code assignment across all reference dataset variants, performance was best for CADD (81.2%) and REMM (81.5%). Optimized thresholds provided moderate evidence toward pathogenicity (CADD, REMM) and moderate (CADD) or supporting (REMM) evidence against pathogenicity. Both sensitivity and specificity of prediction were improved when further categorizing variants based on location in an EPDnew-defined promoter region. Combining predictions (CADD, REMM, and location in a promoter region) increased specificity at the expense of sensitivity. Importantly, the optimal CADD thresholds for assigning ACMG/AMP codes PP3 (≥10) and BP4 (≤8) were vastly different from recommendations for protein-coding variants (PP3 ≥25.3; BP4 ≤22.7); CADD <22.7 would incorrectly assign BP4 for >90% of reported disease-causing cis-regulatory region variants. Our results demonstrate the need to consider a tiered approach and tailored score thresholds to optimize bioinformatic impact prediction for clinical classification of 5' cis-regulatory region variants.
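
The effect of the two CADD threshold sets quoted in the abstract can be illustrated with a minimal sketch. The threshold values (PP3 ≥10 and BP4 ≤8 for 5' cis-regulatory variants; PP3 ≥25.3 and BP4 ≤22.7 for protein-coding variants) come from the abstract. The names `Thresholds`, `assign_code`, `CIS_REGULATORY`, and `PROTEIN_CODING` are hypothetical and are not taken from the study; the sketch also ignores evidence strength, REMM, and promoter location, so it should not be read as the authors' pipeline.

```python
from typing import NamedTuple, Optional


class Thresholds(NamedTuple):
    """Hypothetical container for one pair of CADD cut-offs."""
    pp3_min: float  # score >= pp3_min -> PP3 (evidence toward pathogenicity)
    bp4_max: float  # score <= bp4_max -> BP4 (evidence against pathogenicity)


# Threshold values as quoted in the abstract.
CIS_REGULATORY = Thresholds(pp3_min=10.0, bp4_max=8.0)
PROTEIN_CODING = Thresholds(pp3_min=25.3, bp4_max=22.7)


def assign_code(cadd_score: float, thresholds: Thresholds) -> Optional[str]:
    """Return "PP3", "BP4", or None when the score falls between the cut-offs."""
    if cadd_score >= thresholds.pp3_min:
        return "PP3"
    if cadd_score <= thresholds.bp4_max:
        return "BP4"
    return None


if __name__ == "__main__":
    # Illustrative scores only; these are not data from the study.
    for score in (5.0, 9.0, 15.0, 30.0):
        print(
            score,
            assign_code(score, CIS_REGULATORY),
            assign_code(score, PROTEIN_CODING),
        )
```

Under these assumptions, an illustrative CADD score of 15 is assigned PP3 with the cis-regulatory thresholds but BP4 with the protein-coding thresholds, which is the kind of misassignment the abstract warns about.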