Keywords: GWAS; IV; Mendelian randomization; TWAS; coefficient of determination; instrumental variable

MeSH: Humans; Polymorphism, Single Nucleotide; Genome-Wide Association Study / methods; Transcriptome / genetics; Mendelian Randomization Analysis / methods; Models, Genetic; Cholesterol, LDL / genetics, blood; Phenotype

Source: DOI:10.1016/j.ajhg.2024.06.013

Abstract:
In Mendelian randomization, two methods based on single SNP-trait correlations have been developed to infer the causal direction between an exposure (e.g., a gene) and an outcome (e.g., a trait): MR Steiger's method and its recent extension, Causal Direction-Ratio (CD-Ratio). Here we propose an approach based on R², the coefficient of determination, that combines information from multiple (possibly correlated) SNPs to simultaneously infer the presence and direction of a causal relationship between an exposure and an outcome. Our proposed method generalizes Steiger's method from using a single SNP to using multiple SNPs as instrumental variables (IVs). It is especially useful in transcriptome-wide association studies (TWASs) and similar applications, where sample sizes for gene expression (or another molecular trait) data are typically small, providing a more flexible and powerful approach to inferring causal directions. It can be applied to GWAS summary data with a reference panel. We also discuss the influence of invalid IVs and introduce a new approach, called R2S, to select and remove invalid IVs (if any) to enhance robustness. We compared the performance of the proposed method with existing methods in simulations to demonstrate its advantages. We applied the methods to identify causal genes for high/low-density lipoprotein cholesterol (HDL/LDL) using the individual-level GTEx gene expression data and UK Biobank GWAS data. The proposed method was able to confirm some well-known causal genes while identifying some novel ones. Additionally, we illustrated an application of the proposed method to GWAS summary data to infer causal relationships between HDL/LDL and stroke/coronary artery disease (CAD).
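The intuition behind comparing coefficients of determination can be illustrated with a small simulation. The sketch below is not the authors' procedure: it omits the statistical test, the summary-data variant, and the R2S IV-selection step, and only shows the underlying idea that SNPs acting on the exposure should explain more variance in the exposure than in the outcome when the exposure causes the outcome. All function names, effect sizes, and sample sizes are assumptions chosen for illustration.

```python
import numpy as np


def r_squared(G, t):
    """R^2 from an ordinary least-squares fit of trait t on genotype matrix G."""
    design = np.column_stack([np.ones(len(t)), G])  # add an intercept column
    coef, *_ = np.linalg.lstsq(design, t, rcond=None)
    resid = t - design @ coef
    return 1.0 - resid.var() / t.var()


def compare_directions(G, x, y):
    """Compare the variance the SNPs explain in x and in y (hypothetical helper).

    The larger R^2 is taken as a descriptive hint of which trait is upstream.
    No significance test is performed here.
    """
    r2_x = r_squared(G, x)
    r2_y = r_squared(G, y)
    direction = "exposure -> outcome" if r2_x > r2_y else "outcome -> exposure"
    return r2_x, r2_y, direction


# Simulated data (assumed values): 10 independent SNPs affect the exposure x,
# and x affects the outcome y.
rng = np.random.default_rng(0)
n, p = 5000, 10
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # genotypes coded 0/1/2
x = G @ np.full(p, 0.3) + rng.normal(size=n)         # exposure
y = 0.5 * x + rng.normal(size=n)                     # outcome

r2_x, r2_y, direction = compare_directions(G, x, y)
print(f"R2(exposure) = {r2_x:.3f}, R2(outcome) = {r2_y:.3f}, hint: {direction}")
```

In this toy setting the SNPs explain more of the exposure's variance than the outcome's, so the comparison points from exposure to outcome. With real data, invalid IVs (for example, SNPs with direct effects on the outcome) can distort this comparison, which is the issue the R2S step in the abstract is meant to address.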