Keywords: consensus clustering coronavirus evolution phylogenomic networks

MeSH: Humans; Phylogeny; Pandemics; Consensus; Reproducibility of Results; COVID-19

Source: DOI:10.1002/jmv.29233

Abstract:
The COVID-19 pandemic underscores the importance of studying coronaviruses (CoVs). This study investigates the evolutionary patterns of 350 CoVs using four structural proteins (S, E, M, and N) and introduces a consensus methodology for constructing a comprehensive phylogenomic network. Our clustering of the CoVs into four genera is consistent with the current CoV classification. We also calculate network centrality measures to identify CoV strains with significant average weighted degree and betweenness centrality values, focusing on RaTG13 in the beta genus and NGA/A116E7/2006 in the gamma genus. We compare the phylogenies obtained with our distance-based approach against those from the character-based model in IQ-TREE. The two methods yield largely consistent results, which supports the reliability of the consensus approach. Moreover, on the data set of 350 CoVs, the consensus method runs approximately 5000 times faster than IQ-TREE. This efficiency makes large-scale phylogenomic studies of CoVs more feasible.
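The abstract does not spell out how the consensus network or its centrality measures are computed, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the consensus is an element-wise mean of the per-protein distance matrices, that an edge is kept when the consensus distance falls below a cutoff, and that the edge weight is one minus that distance. The strain names, the distance values, and the 0.65 cutoff are all hypothetical toy data, not results from the study. Betweenness is computed with Brandes' algorithm on hop counts, ignoring edge weights.

```python
from collections import deque
from itertools import combinations

def consensus_distances(matrices):
    """Element-wise mean of the distance matrices (an assumed consensus rule)."""
    n = len(matrices[0])
    k = len(matrices)
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)] for i in range(n)]

def build_network(names, dist, cutoff):
    """Keep pairs closer than `cutoff`; weight = 1 - distance (assumed similarity)."""
    edges = {}
    for i, j in combinations(range(len(names)), 2):
        if dist[i][j] < cutoff:
            edges[(names[i], names[j])] = 1.0 - dist[i][j]
    return edges

def weighted_degree(names, edges):
    """Sum of the weights of the edges incident to each node."""
    degree = {name: 0.0 for name in names}
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree

def betweenness(names, edges):
    """Unnormalized betweenness via Brandes' algorithm on hop-count shortest paths."""
    adjacency = {name: [] for name in names}
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    score = {name: 0.0 for name in names}
    for source in names:
        order, hops = [], {source: 0}
        preds = {name: [] for name in names}
        paths = {name: 0 for name in names}
        paths[source] = 1
        queue = deque([source])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in hops:
                    hops[w] = hops[v] + 1
                    queue.append(w)
                if hops[w] == hops[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = {name: 0.0 for name in names}
        for w in reversed(order):  # accumulate dependencies back toward the source
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # In an undirected graph every pair is visited from both ends.
    return {name: score[name] / 2 for name in names}

# Hypothetical toy distances for four made-up strains, one matrix per protein.
names = ["strain_A", "strain_B", "strain_C", "strain_D"]
S = [[0, 0.2, 0.8, 0.9], [0.2, 0, 0.7, 0.9], [0.8, 0.7, 0, 0.3], [0.9, 0.9, 0.3, 0]]
E = [[0, 0.1, 0.6, 0.7], [0.1, 0, 0.6, 0.8], [0.6, 0.6, 0, 0.2], [0.7, 0.8, 0.2, 0]]
M = [[0, 0.2, 0.7, 0.8], [0.2, 0, 0.5, 0.9], [0.7, 0.5, 0, 0.3], [0.8, 0.9, 0.3, 0]]
N = [[0, 0.3, 0.7, 0.8], [0.3, 0, 0.6, 0.8], [0.7, 0.6, 0, 0.4], [0.8, 0.8, 0.4, 0]]

consensus = consensus_distances([S, E, M, N])
edges = build_network(names, consensus, cutoff=0.65)  # cutoff is an arbitrary choice
degree = weighted_degree(names, edges)
centrality = betweenness(names, edges)

for name in names:
    print(name, round(degree[name], 3), centrality[name])
```

With this toy input the retained edges form a chain from strain_A to strain_D, so the two interior strains carry all of the betweenness. A real analysis would start from distances derived from the protein sequences and would likely use a graph library rather than hand-written routines.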