BACKGROUND: Microsatellites, or simple sequence repeats (SSRs), are tandem repeats of 1-6 nucleotide motifs of DNA or RNA that are ubiquitous across the genomes of viruses, prokaryotes, and eukaryotes. They may be localized to both coding and non-coding regions. SSRs play an important role in replication, gene regulation, transcription, and protein function. The Caliciviridae (CLV) family comprises non-enveloped viruses with single-stranded RNA (ss-RNA) genomes and icosahedral capsids 27-35 nm in diameter. The genome size lies between 6.4 and 8.6 kb.
RESULTS: The incidence, composition, diversity, complexity, and host range of different microsatellites in 62 representatives of the family Caliciviridae were systematically analyzed. The full-length genome sequences were accessed from NCBI (https://www.ncbi.nlm.nih.gov), and microsatellites were extracted with MISA software. The average genome size was about 7538 bp, ranging from 6273 bp (CLV61) to 8798 bp (CLV47). The average GC content of the genomes was ~51%. A total of 1317 SSRs and 53 compound SSRs (cSSRs) were found in the studied genomes. CLV41 and CLV49 contained the highest and lowest numbers of SSRs, with 32 and 10 respectively, while CLV16 had the maximum cSSR incidence of 4. Twenty-nine species did not contain any cSSR. The incidence of mono-, di-, and tri-nucleotide SSRs was 219, 884, and 206, respectively. The most prevalent mono-, di-, and tri-nucleotide repeat motifs were "C" (126 SSRs), AC/CA (240 SSRs), and TGA/ACT (23 SSRs), respectively. Most of the SSRs and cSSRs were biased toward the coding region, with at least ~90% of SSRs located in the coding regions of the genomes. Viruses with similar hosts were found close to each other on the phylogenetic tree, suggesting that the virus host is one of the driving forces of their evolution.
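The extraction step can be illustrated with a minimal sketch. It is not MISA itself and does not reproduce the settings used in the study: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical values chosen only for illustration. It scans a sequence for perfect tandem repeats of 1-6 nucleotide motifs using regular expressions.

```python
import re

# Hypothetical minimum repeat counts per motif length (an assumption for
# illustration; the thresholds used in the study are not stated here).
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def is_degenerate(motif):
    """True if the motif is itself a repeat of a shorter unit, e.g. 'AA' or 'ACAC'."""
    k = len(motif)
    return any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return perfect SSRs as (motif, repeat_count, start, end), end exclusive."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGTU]{%d})\1{%d,}" % (k, n - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            if is_degenerate(motif):
                # Already reported under a shorter motif length; step one
                # base forward so an overlapping true repeat is not missed.
                pos = m.start() + 1
                continue
            hits.append((motif, len(m.group(0)) // k, m.start(), m.end()))
            pos = m.end()
    return sorted(hits, key=lambda h: h[2])


if __name__ == "__main__":
    demo = "TTG" + "C" * 7 + "GT" + "AC" * 4 + "TG" + "TGA" * 3 + "CT"
    for motif, count, start, end in find_ssrs(demo):
        print(f"({motif}){count} at {start}-{end}")
```

Compound SSRs (two or more SSRs separated by a short gap) could be derived from this output by merging hits whose distance falls below a chosen maximum, which is likewise a parameter that would have to be set to match the analysis.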
CONCLUSIONS: The Caliciviridae genomes do not conform to any pattern of SSR signature in terms of incidence, composition, and localization. This unique property of SSRs plays an important role in viral evolution. The clustering of viruses with similar hosts in the phylogenetic tree is evidence of the uniqueness of the SSR signature.