Keywords: CNV; GWAS; MOPline; SV; WGS; imputation; short read; structural variant; structural variation; whole-genome sequencing

Source: DOI:10.1016/j.xgen.2023.100328

Abstract:
Genomic structural variation (SV) affects genetic and phenotypic characteristics in diverse organisms, but the lack of reliable SV detection methods has hindered genetic analysis. We developed a computational algorithm, MOPline, that combines missing-call recovery with high-confidence SV call selection and genotyping using short-read whole-genome sequencing (WGS) data. Using 3,672 high-coverage WGS datasets, MOPline stably detected ∼16,000 SVs per individual, roughly 1.7- to 3.3-fold more than previous large-scale projects, while exhibiting comparable statistical quality metrics. We imputed SVs in 181,622 Japanese individuals for 42 diseases and 60 quantitative traits. A genome-wide association study with the imputed SVs revealed 41 top-ranked or nearly top-ranked genome-wide significant SVs, including 8 exonic SVs with 5 novel associations and enriched mobile element insertions. This study demonstrates that short-read WGS data can be used to identify rare and common SVs associated with a variety of traits.