Keywords: Caulimoviridae; cotton; endogenous form; episomal form; genome integration; pararetrovirus; virus

MeSH: Gossypium; Phylogeny; Computational Biology; High-Throughput Nucleotide Sequencing; Movement

Source: DOI:10.3390/v15081643

Abstract:
Analyses of Illumina-based high-throughput sequencing data generated during characterization of the cotton leafroll dwarf virus population in Mississippi (2020-2022) consistently yielded contigs of varying size (most frequently 4 to 7 kb) with identical nucleotide content that shared similarities with reverse transcriptases (RTases) encoded by extant plant pararetroviruses (family Caulimoviridae). These initial data prompted an in-depth study using molecular and bioinformatic approaches to characterize the nature and origins of these caulimovirid-like sequences. Here, we report on endogenous viral elements (EVEs) related to extant members of the family Caulimoviridae and integrated into the genome of upland cotton (Gossypium hirsutum), for which we propose the provisional name "endogenous cotton pararetroviral elements" (eCPRVE). Our investigations pinpointed a ~15 kbp-long locus on the A04 chromosome consisting of head-to-head oriented tandem copies located on the positive- and negative-sense DNA strands (eCPRVE+ and eCPRVE-). Sequences of the eCPRVE+ comprised nearly complete and slightly decayed genome information, including ORFs coding for the viral movement protein (MP), coat protein (CP), RTase, and transactivator/viroplasm protein (TA). Phylogenetic analyses of major viral proteins suggest that the eCPRVE+ may have been initially derived from the genome of a cognate virus belonging to a putative new genus within the family. Unexpectedly, an identical 15 kb-long locus composed of two eCPRVE copies was also detected in a newly recognized species, G. ekmanianum, shedding some light on the relatively recent evolution within the cotton family.