Keywords: Genetic diseases Machine Learning Oligogenic Prioritization Whole-Exome Sequencing

MeSH: Humans; Exome; Exome Sequencing / methods; Genetic Variation; High-Throughput Nucleotide Sequencing / methods; Computational Biology / methods

Source: DOI:10.1093/bioinformatics/btae184

Abstract:
BACKGROUND: Whole exome sequencing (WES) has emerged as a powerful tool for genetic research, enabling the collection of a tremendous amount of data about human genetic variation. However, properly identifying which variants are causative of a genetic disease remains an important challenge, often due to the number of variants that need to be screened. Expanding the screening to combinations of variants in two or more genes, as would be required under the oligogenic inheritance model, simply blows this problem out of proportion.
RESULTS: We present here the High-throughput oligogenic prioritizer (Hop), a novel prioritization method that uses direct oligogenic information at the variant, gene and gene pair level to detect digenic variant combinations in WES data. This method leverages information from a knowledge graph, together with specialized pathogenicity predictions in order to effectively rank variant combinations based on how likely they are to explain the patient's phenotype. The performance of Hop is evaluated in cross-validation on 36 120 synthetic exomes for training and 14 280 additional synthetic exomes for independent testing. Whereas the known pathogenic variant combinations are found in the top 20 in approximately 60% of the cross-validation exomes, 71% are found in the same ranking range when considering the independent set. These results provide a significant improvement over alternative approaches that depend simply on a monogenic assessment of pathogenicity, including early attempts for digenic ranking using monogenic pathogenicity scores.
AVAILABILITY AND IMPLEMENTATION: Hop is available at https://github.com/oligogenic/HOP.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
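
The RESULTS paragraph reports how often the known pathogenic variant combination lands in the top 20 of a ranking. The sketch below shows one way such a top-k figure could be computed. It is not taken from the Hop code base: the function names `rank_of` and `top_k_recall`, the dictionary layout (combination identifier mapped to a score, higher meaning more likely pathogenic) and the pessimistic handling of ties are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Optional, Sequence, Tuple


def rank_of(scores: Dict[Hashable, float], target: Hashable) -> Optional[int]:
    """Return the 1-based rank of `target` under descending score order.

    Returns None when the target was never scored. Ties are resolved
    pessimistically: every other combination with an equal score is
    counted as ranking ahead of the target (an assumption of this sketch).
    """
    if target not in scores:
        return None
    target_score = scores[target]
    ahead = sum(
        1 for key, value in scores.items()
        if key != target and value >= target_score
    )
    return ahead + 1


def top_k_recall(
    exomes: Sequence[Tuple[Dict[Hashable, float], Hashable]],
    k: int = 20,
) -> float:
    """Fraction of exomes whose known combination ranks within the top k.

    Each element of `exomes` is a pair: (scores for all candidate
    combinations in that exome, identifier of the known combination).
    """
    if not exomes:
        return 0.0
    hits = 0
    for scores, known in exomes:
        rank = rank_of(scores, known)
        if rank is not None and rank <= k:
            hits += 1
    return hits / len(exomes)


if __name__ == "__main__":
    # Two toy exomes with hypothetical combination identifiers and scores.
    toy = [
        ({"geneA|geneB": 0.9, "geneC|geneD": 0.5, "geneE|geneF": 0.1}, "geneA|geneB"),
        ({"geneA|geneB": 0.9, "geneC|geneD": 0.5, "geneE|geneF": 0.1}, "geneE|geneF"),
    ]
    print(top_k_recall(toy, k=2))
```

With real data, `k=20` would correspond to the ranking range quoted in the abstract. How the published evaluation treats tied scores is not stated here, so the pessimistic choice above may differ from it.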