MeSH: Algorithms; Alleles; Computer Simulation; Crohn Disease / genetics, metabolism, pathology; Gene Frequency; Gene-Environment Interaction; Genome, Human; Genotype; Germany; Humans; Linkage Disequilibrium; Models, Genetic; Polymorphism, Single Nucleotide

Source: DOI:10.1007/s00439-021-02294-z

Abstract:
Case-only (CO) studies are a powerful means to uncover gene-environment (G × E) interactions for complex human diseases. Moreover, such studies may in principle also draw upon genotype imputation to increase statistical power even further. However, genotype imputation usually employs healthy controls such as the Haplotype Reference Consortium (HRC) data as an imputation base, which may systematically perturb CO studies in genomic regions with main effects upon disease risk. Using genotype data from 719 German Crohn Disease (CD) patients, we investigated the level of imputation accuracy achievable for single nucleotide polymorphisms (SNPs) with or without a genetic main effect, and with varying minor allele frequency (MAF). Genotypes were imputed from neighbouring SNPs at different levels of linkage disequilibrium (LD) to the target SNP using the HRC data as an imputation base. Comparison of the true and imputed genotypes revealed lower imputation accuracy for SNPs with strong main effects. We also simulated different levels of G × E interaction to evaluate the potential loss of statistical validity and power incurred by the use of imputed genotypes. Simulations under the null hypothesis revealed that genotype imputation does not inflate the type I error rate of CO studies of G × E. However, the statistical power was found to be reduced by imputation, particularly for SNPs with low MAF, and a gradual loss of statistical power resulted when the level of LD to the SNPs driving the imputation decreased. Our study thus highlights that genotype imputation should be employed with great care in CO studies of G × E interaction.
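The logic of a case-only test, and the way genotype error can erode its power, can be illustrated with a small simulation. The sketch below is not the authors' pipeline. It assumes a binary carrier genotype, a binary exposure that is independent of genotype in the population, a logistic risk model with arbitrary illustrative coefficients, and random genotype flips as a crude stand-in for imputation error (real imputation error depends on LD and MAF). All function names and parameter values are hypothetical.

```python
import math
import random


def case_cell_probs(p_g, p_e, b0, b_g, b_e, b_ge):
    """Joint distribution of (G, E) among cases under a logistic risk model.

    G and E are independent in the population, so
    P(g, e | case) is proportional to P(g) * P(e) * P(case | g, e).
    """
    weights = {}
    for g in (0, 1):
        for e in (0, 1):
            prior = (p_g if g else 1 - p_g) * (p_e if e else 1 - p_e)
            logit = b0 + b_g * g + b_e * e + b_ge * g * e
            risk = 1 / (1 + math.exp(-logit))
            weights[(g, e)] = prior * risk
    total = sum(weights.values())
    return {cell: w / total for cell, w in weights.items()}


def simulate_cases(n, probs, rng):
    """Draw n cases as (genotype, exposure) pairs."""
    cells = list(probs)
    return rng.choices(cells, weights=[probs[c] for c in cells], k=n)


def add_genotype_noise(cases, error_rate, rng):
    """Flip each genotype with probability error_rate (assumed error model)."""
    return [(1 - g if rng.random() < error_rate else g, e) for g, e in cases]


def case_only_test(cases):
    """Return (log odds ratio, z statistic) for G-E association among cases."""
    # Start each cell at 0.5 (Haldane-Anscombe correction) to avoid zero counts.
    counts = {(g, e): 0.5 for g in (0, 1) for e in (0, 1)}
    for cell in cases:
        counts[cell] += 1
    log_or = math.log(
        counts[(1, 1)] * counts[(0, 0)] / (counts[(1, 0)] * counts[(0, 1)])
    )
    se = math.sqrt(sum(1 / c for c in counts.values()))  # Woolf standard error
    return log_or, log_or / se


rng = random.Random(1)
probs = case_cell_probs(p_g=0.3, p_e=0.4, b0=-5.0, b_g=0.5, b_e=0.3,
                        b_ge=math.log(2))
cases = simulate_cases(719, probs, rng)          # 719 matches the sample size above
noisy = add_genotype_noise(cases, 0.15, rng)     # 15% flips, arbitrary choice
print("true genotypes :", case_only_test(cases))
print("noisy genotypes:", case_only_test(noisy))
```

With a rare disease, the G-E odds ratio among cases approximates the multiplicative interaction term, which is why no controls are needed. Non-differential genotype error pulls that odds ratio towards 1, which corresponds to a loss of power rather than an inflated type I error rate, in line with the pattern the abstract reports.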