Keywords: Caprinae; ERVs; divergence pattern; gene overlapping; genome annotation

MeSH: Animals; Sheep; Endogenous Retroviruses / genetics; RNA, Long Noncoding / genetics; Evolution, Molecular; Phylogeny; Retroviridae Infections

Source: DOI:10.3390/v16030398

Abstract:
Interest in endogenous retroviruses (ERVs) has been fueled by their impact on host genome evolution. In this study, we used multiple pipelines to conduct a de novo exploration and annotation of ERVs in 13 species of the Caprinae subfamily. Through analyses of sequence identity, structural organization, and phylogeny, we defined 28 ERV groups within Caprinae, comprising 19 gammaretrovirus groups and 9 betaretrovirus groups. Notably, we identified four recent and potentially active groups prevalent in the Caprinae genomes. Additionally, our investigation revealed that most long noncoding RNA (lncRNA) genes and protein-coding (PC) genes contain ERV-derived sequences. Specifically, ERV-derived sequences were present in approximately 75% of protein-coding genes and 81% of lncRNA genes in sheep. Similarly, in goats, ERV-derived sequences were found in approximately 74% of protein-coding genes and 75% of lncRNA genes. We conclude that the majority of ERVs in the Caprinae genomes can be categorized as fossils: remnants of past retroviral infections that have become permanently integrated into the genomes. Nevertheless, the identification of the Cap_ERV_20, Cap_ERV_21, Cap_ERV_24, and Cap_ERV_25 groups indicates the presence of relatively recent and potentially active ERVs in these genomes. These particular groups may contribute to the ongoing evolution of the Caprinae genome. The identification of putatively active ERVs in the Caprinae genomes raises the possibility of harnessing them for future genetic marker development.