Keywords: MASP; T. cruzi; antigenicity; complex genomes; copy number variation; mucins; multicopy genes; transsialidases; variability

MeSH: Humans; Animals; Mice; Trypanosoma cruzi / genetics; DNA Copy Number Variations; Genome, Protozoan; Mannose-Binding Protein-Associated Serine Proteases / genetics; Multigene Family; Chagas Disease / parasitology; High-Throughput Nucleotide Sequencing / methods; Mammals / genetics

Source: DOI:10.1128/mbio.02319-22

Abstract:
Repetitive elements cause assembly fragmentation in complex eukaryotic genomes, limiting the study of their variability. The genome of Trypanosoma cruzi, the parasite that causes Chagas disease, has a high repetitive content, including multigene families. Although many T. cruzi multigene families encode surface proteins that play pivotal roles in host-parasite interactions, their variability is currently underestimated, as their high repetitive content results in collapsed gene variants. To estimate sequence variability and copy number variation of multigene families, we developed a read-based approach that is independent of gene-specific read mapping and de novo assembly. This methodology was used to estimate the copy number and variability of MASP, TcMUC, and Trans-Sialidase (TS), the three largest T. cruzi multigene families, in 36 strains, including members of all six parasite discrete typing units (DTUs). We found that these three families present a specific pattern of variability and copy number among the distinct parasite DTUs. Inter-DTU hybrid strains presented a higher variability of these families, suggesting that maintaining a larger content of their members could be advantageous. In addition, in a chronic murine model and chronic Chagasic human patients, the immune response was focused on TS antigens, suggesting that targeting TS conserved sequences could be a potential avenue to improve diagnosis and vaccine design against Chagas disease. Finally, the proposed approach can be applied to study multicopy genes in any organism, opening new avenues to access sequence variability in complex genomes.

IMPORTANCE Sequences that have several copies in a genome, such as multicopy-gene families, mobile elements, and microsatellites, are among the most challenging genomic segments to study. They are frequently underestimated in genome assemblies, hampering the correct assessment of these important players in genome evolution and adaptation. Here, we developed a new methodology to estimate variability and copy numbers of repetitive genomic regions and employed it to characterize the T. cruzi multigene families MASP, TcMUC, and transsialidase (TS), which are important virulence factors in this parasite. We showed that multigene families vary in sequence and content among the parasite's lineages, whereas hybrid strains have a higher sequence variability that could be advantageous to the parasite's survivability. By identifying conserved sequences within multigene families, we showed that the mammalian host immune response toward these multigene families is usually focused on the TS multigene family. These TS conserved and immunogenic peptides can be explored in future work as diagnostic targets or vaccine candidates for Chagas disease. Finally, this methodology can be easily applied to any organism of interest, which will aid in our understanding of complex genomic regions.