Keywords: ancestral recombination graph; breakpoint localization; informative sites; phylogeny reconstruction; recombination detection

MeSH: Algorithms; Recombination, Genetic; Phylogeny; Chromosome Breakpoints; Sequence Alignment / methods; Models, Genetic

Source: DOI:10.1093/molbev/msae133

Abstract:
Phylogenetic methods are widely used to reconstruct the evolutionary relationships among species and individuals. However, recombination can obscure ancestral relationships as individuals may inherit different regions of their genome from different ancestors. It is, therefore, often necessary to detect recombination events, locate recombination breakpoints, and select recombination-free alignments prior to reconstructing phylogenetic trees. While many earlier studies have examined the power of different methods to detect recombination, very few have examined the ability of these methods to accurately locate recombination breakpoints. In this study, we simulated genome sequences based on ancestral recombination graphs and explored the accuracy of three popular recombination detection methods: MaxChi, 3SEQ, and Genetic Algorithm Recombination Detection. The accuracy of inferred breakpoint locations was evaluated along with the key factors contributing to variation in accuracy across datasets. While many different genomic features contribute to the variation in performance across methods, the number of informative sites consistent with the pattern of inheritance between parent and recombinant child sequences always has the greatest contribution to accuracy. While partitioning sequence alignments based on identified recombination breakpoints can greatly decrease phylogenetic error, the quality of phylogenetic reconstructions depends very little on how breakpoints are chosen to partition the alignment. Our work sheds light on how different features of recombinant genomes affect the performance of recombination detection methods and suggests best practices for reconstructing phylogenies based on recombination-free alignments.
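To make the notion of "informative sites consistent with the pattern of inheritance" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the procedure used in the study: it assumes a single recombinant child with two known parents, one breakpoint, and a child that inherits from parent P to the left of the breakpoint and from parent Q to the right. The function names `informative_sites` and `consistent_site_count` are hypothetical.

```python
def informative_sites(parent_p: str, parent_q: str, child: str):
    """Return (positions where the child matches P only, positions where it matches Q only).

    Assumption: a site is informative when the two parents differ and the
    child carries the allele of exactly one of them.
    """
    if not (len(parent_p) == len(parent_q) == len(child)):
        raise ValueError("sequences must be aligned to the same length")
    supports_p, supports_q = [], []
    for i, (p, q, c) in enumerate(zip(parent_p, parent_q, child)):
        if "-" in (p, q, c) or p == q:
            continue  # gaps and sites where the parents agree carry no signal
        if c == p:
            supports_p.append(i)
        elif c == q:
            supports_q.append(i)
    return supports_p, supports_q


def consistent_site_count(parent_p: str, parent_q: str, child: str, breakpoint: int) -> int:
    """Count informative sites that agree with 'P left of breakpoint, Q from breakpoint on'."""
    supports_p, supports_q = informative_sites(parent_p, parent_q, child)
    left = sum(1 for i in supports_p if i < breakpoint)
    right = sum(1 for i in supports_q if i >= breakpoint)
    return left + right


# Toy alignment (made up for illustration): the child copies P for the
# first four sites and Q for the last four.
P = "AAAAAAAA"
Q = "CCAACCAC"
CHILD = "AAAACCAC"

print(informative_sites(P, Q, CHILD))         # ([0, 1], [4, 5, 7])
print(consistent_site_count(P, Q, CHILD, 4))  # 5
print(consistent_site_count(P, Q, CHILD, 1))  # 4
```

In this toy case the true breakpoint (position 4) is supported by all five informative sites, while a misplaced breakpoint is supported by fewer. Note also that positions 2, 3, and 4 all give the same count here because no informative site lies between sites 1 and 4, which gives some intuition for why sparse informative sites limit how precisely a breakpoint can be located.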