Keywords: ABC; Joint parameters estimation; Kernel; Mechanistic model; Population genomics; Summary statistics

Source: DOI:10.1016/j.jcmds.2024.100091

Abstract:
Statistical estimation of parameters in large models of evolutionary processes is often too computationally expensive to pursue using exact model likelihoods, even with single-nucleotide polymorphism (SNP) data, which offer a way to reduce the size of genetic data while retaining relevant information. Approximate Bayesian Computation (ABC) performs statistical inference about the parameters of large models by using simulations to bypass direct evaluation of model likelihoods. We develop a mechanistic model to simulate forward-in-time divergent selection with variable migration rates, modes of reproduction (sexual, asexual), and length and number of migration-selection cycles. We investigate the computational feasibility of ABC for statistical inference and study the quality of estimates of the positions of loci under selection and of the strength of selection. To expand the parameter space of positions under selection, we enhance the model by implementing an outlier scan on summarized observed data. We evaluate the usefulness of summary statistics well known to capture the strength of selection, and assess their informativeness under divergent selection. We also evaluate the effect of genetic drift relative to an idealized deterministic model with single-locus selection. We discuss the role of the recombination rate as a confounding factor in estimating the strength of divergent selection, and emphasize its importance in the breakdown of linkage disequilibrium (LD). We identify the part of the model's parameter space in which a strong signal for estimating selection is recovered, and determine whether population differentiation-based or LD-based summary statistics perform well in estimating selection.
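To make the idea of likelihood-free inference concrete, the following is a minimal sketch of ABC rejection sampling. It is not the authors' mechanistic model: it assumes a toy haploid Wright-Fisher population with a single locus under selection, a uniform prior on the selection coefficient, and the final allele frequency as the only summary statistic. All function names, parameter values, and the tolerance are hypothetical choices made for illustration.

```python
import random


def simulate_final_freq(s, rng, n_pop=100, generations=10, p0=0.5):
    """Toy forward-in-time simulator (an assumption, not the paper's model).

    Each generation applies deterministic single-locus selection with
    coefficient s, then genetic drift via binomial sampling of n_pop
    individuals. Returns the final frequency of the favoured allele.
    """
    p = p0
    for _ in range(generations):
        # Deterministic selection step for a haploid population.
        p_sel = p * (1.0 + s) / (1.0 + s * p)
        # Drift: draw n_pop offspring, each carrying the allele w.p. p_sel.
        count = sum(1 for _ in range(n_pop) if rng.random() < p_sel)
        p = count / n_pop
    return p


def abc_rejection(observed, n_draws=1000, tolerance=0.05, seed=0):
    """Keep prior draws whose simulated summary lies near the observed one.

    No likelihood is evaluated: the simulator stands in for it.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        s = rng.uniform(0.0, 0.5)  # hypothetical uniform prior on s
        simulated = simulate_final_freq(s, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(s)
    return accepted


if __name__ == "__main__":
    posterior_sample = abc_rejection(observed=0.8)
    mean_s = sum(posterior_sample) / len(posterior_sample)
    print(f"accepted {len(posterior_sample)} draws, posterior mean s = {mean_s:.3f}")
```

The accepted draws approximate the posterior of the selection coefficient given the observed summary. Setting `n_pop` very large in this sketch would approach the deterministic single-locus trajectory, which gives a rough sense of how drift widens the accepted range.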