Recombination hotspot

  • Article type: Journal Article
    Meiotic recombination plays a pivotal role in genetic evolution. Genetic variation induced by recombination is a crucial factor in generating biodiversity and a driving force for evolution. At present, the development of recombination hotspot prediction methods faces two challenges: insufficient feature extraction and limited generalization capability. This paper focuses on recombination hotspot prediction methods. We explored deep learning-based recombination hotspot prediction and scrutinized the shortcomings of prevalent models on this task. To address these deficiencies, an automated machine learning approach was used to construct a recombination hotspot prediction model. The model combined sequence information with physicochemical properties by employing TF-IDF-Kmer and DNA composition components to acquire more effective feature data. Experimental results validate the effectiveness of the feature extraction method and automated machine learning technology used in this study. The final model was validated on three distinct datasets and yielded accuracy rates of 97.14%, 79.71%, and 98.73%, surpassing the current leading models by 2%, 2.56%, and 4%, respectively. In addition, we incorporated tools such as SHAP and AutoGluon to analyze the interpretability of black-box models, examined the impact of individual features on the results, and investigated the reasons behind misclassification of samples. Finally, an application for recombination hotspot prediction was established to give researchers easy access to the necessary information and tools. The research outcomes of this paper underscore the enormous potential of automated machine learning methods in gene sequence prediction.
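    The abstract names TF-IDF-Kmer features without giving the formula. As a rough illustration only, the sketch below treats each DNA sequence as a "document" of overlapping k-mers and weights each k-mer by term frequency times inverse document frequency. The function names, the choice k = 3, the unsmoothed IDF variant log(N / df), and the toy sequences are all assumptions made here, not the authors' implementation.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of a DNA sequence (upper-cased)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf_kmer(sequences, k=3):
    """Compute a TF-IDF weight for every k-mer of every sequence.

    TF  = count of the k-mer in a sequence / number of k-mers in that sequence
    IDF = log(N / number of sequences containing the k-mer)
    (one common, unsmoothed variant; assumed here for illustration)
    """
    docs = [Counter(kmers(s, k)) for s in sequences]
    n_docs = len(docs)

    # Document frequency: in how many sequences does each k-mer occur?
    doc_freq = Counter()
    for counts in docs:
        doc_freq.update(counts.keys())

    features = []
    for counts in docs:
        total = sum(counts.values())
        features.append({
            km: (c / total) * math.log(n_docs / doc_freq[km])
            for km, c in counts.items()
        })
    return features


# Hypothetical toy input, not data from the study.
toy = ["ATGCATGC", "ATGGGGCC", "TTTTATGC"]
vectors = tfidf_kmer(toy, k=3)
```

    In this toy input the 3-mer ATG occurs in every sequence, so its weight is zero, while GGG occurs in only one sequence and receives a positive weight. A real pipeline would turn these dictionaries into fixed-length vectors before passing them to a learner.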

  • Article type: Journal Article
    BACKGROUND: Vancomycin is frequently used as a last line of defence against infections due to multidrug-resistant Staphylococcus aureus (S. aureus). A recent finding described the acquisition of vancomycin-resistant S. aureus strains by the integration of an enterococcal plasmid containing the vanA operon into the S. aureus chromosome via homologous recombination involving a specific integration site called locus L2.
    METHODS: To characterise all mechanisms of acquisition of vanA, this study searched 15 706 S. aureus genomes for vanA and described its genetic environment.
    RESULTS: A complete vanA operon was found in 25 S. aureus strains isolated from 12 patients, including nine co-isolated with vancomycin-resistant Enterococcus strains. vanA was found within transposon Tn1546-like elements on 17 plasmids and eight chromosomes. vanA might be acquired through conjugation of enterococcal and staphylococcal plasmids, transposition of Tn1546 carrying vanA, and plasmid integration into the chromosome. Further, L2 was detected in 2087 genomes (13.3%) of S. aureus strains across different continents. Six potential chromosomal hotspots were identified for integration of the entire vanA-containing enterococcal plasmid by homologous recombination via L2.
    CONCLUSIONS: These findings suggest that the recently described scenario in a New York patient could be reproduced anywhere. Surveillance of this possibility is mandatory, especially in patients with vancomycin-resistant Enterococcus infection or colonisation.

  • Article type: Journal Article
    BACKGROUND: GGC and GCC short tandem repeats (STRs) have various evolutionary, biological, and pathological implications. However, the fundamental two-repeats (dyads) of these STRs are largely unexplored.
    RESULTS: On a genome-wide scale, we mapped (GGC)2 and (GCC)2 dyads in human, and found monumental colonies (distance between each dyad < 500 bp) of extraordinary density and, in some instances, periodicity. The largest (GCC)2 and (GGC)2 colonies were intergenic, homogeneous, and human-specific, consisting of 219 (GCC)2 on chromosome 2 (probability < 1.545E-219) and 70 (GGC)2 on chromosome 9 (probability = 1.809E-148). We also found that several colonies were shared with other great apes and directionally increased in density and complexity in human, such as a colony of 99 (GCC)2 on chromosome 20 that specifically expanded in great apes and reached maximum complexity in human (probability 1.545E-220). Numerous other colonies of evolutionary relevance in human were detected in other largely overlooked regions of the genome, such as chromosome Y and pseudogenes. Several of the genes containing or nearest to those colonies were divergently expressed in human.
    CONCLUSIONS: (GCC)2 and (GGC)2 form unprecedented genomic colonies that coincide with the evolution of human and other great apes. The extent of the genomic rearrangements leading to those colonies supports overlooked recombination hotspots, shared across great apes. The identified colonies deserve to be studied in mechanistic, evolutionary, and functional platforms.
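    The colony definition in the RESULTS (dyads separated by less than 500 bp) can be sketched as a two-step scan: locate every dyad, then chain neighbouring dyads whose spacing is under the threshold. The sketch below is an assumption-laden illustration: it measures distance between dyad start positions, searches one strand only, and uses hypothetical function names and a `min_size` cutoff that the abstract does not specify.

```python
def find_dyads(seq, unit="GCC"):
    """Return 0-based start positions of every (unit)2 dyad, overlaps included."""
    dyad = unit * 2
    seq = seq.upper()
    positions = []
    start = seq.find(dyad)
    while start != -1:
        positions.append(start)
        start = seq.find(dyad, start + 1)
    return positions


def group_colonies(positions, max_gap=500, min_size=2):
    """Chain sorted dyad positions into colonies.

    Consecutive dyads stay in the same colony while their start-to-start
    distance is below max_gap; groups smaller than min_size are dropped.
    Both thresholds are illustrative assumptions.
    """
    colonies, current = [], []
    for pos in positions:
        if current and pos - current[-1] >= max_gap:
            if len(current) >= min_size:
                colonies.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_size:
        colonies.append(current)
    return colonies


# Hypothetical toy sequence: two dyads 106 bp apart, a third 606 bp further on.
toy = "GCCGCC" + "A" * 100 + "GCCGCC" + "A" * 600 + "GCCGCC"
positions = find_dyads(toy)
colonies = group_colonies(positions)
```

    Here the first two dyads form one colony and the isolated third dyad is discarded. The probabilities reported in the abstract would require a separate null model that this sketch does not attempt.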

  • Article type: Journal Article
    Meiotic recombination is an important evolutionary force and an essential meiotic process. In many species, recombination events concentrate into hotspots defined by the site-specific binding of PRDM9. Rapid evolution of Prdm9's zinc finger DNA-binding array leads to remarkably abrupt shifts in the genomic distribution of hotspots between species, but the question of how Prdm9 allelic variation shapes the landscape of recombination between populations remains less well understood. Wild house mice (Mus musculus) harbor exceptional Prdm9 diversity, with >150 alleles identified to date, and pose a particularly powerful system for addressing this open question. We employed a coalescent-based approach to construct broad- and fine-scale sex-averaged recombination maps from contemporary patterns of linkage disequilibrium in nine geographically isolated wild house mouse populations, including multiple populations from each of three subspecies. Comparing maps between wild mouse populations and subspecies reveals several themes. First, we report weak fine- and broad-scale recombination map conservation across subspecies and populations, with genetic divergence offering no clear prediction for recombination map divergence. Second, most hotspots are unique to one population, an outcome consistent with minimal sharing of Prdm9 alleles between surveyed populations. Finally, by contrasting aggregate hotspot activity on the X versus autosomes, we uncover evidence for population-specific differences in the degree and direction of sex dimorphism for recombination. Overall, our findings illuminate the variability of both the broad- and fine-scale recombination landscape in M. musculus and underscore the functional impact of Prdm9 allelic variation in wild mouse populations.

  • Article type: Journal Article
    The genetic variability of SARS-CoV-2 (genus Betacoronavirus, family Coronaviridae) has been scrutinized since its first detection in December 2019. Although the role of structural variants, particularly deletions, in virus evolution is little explored, these genome changes are extremely frequent. They are associated with relevant processes, including immune escape and attenuation. Deletions commonly occur in accessory ORFs and might even lead to the complete loss of one or more ORFs. This scenario poses an interesting question about the origin and spread of extreme structural rearrangements that persist without compromising virus viability. Here, we analyze the genome of SARS-CoV-2 in late 2021 in Uruguay and identify a Delta lineage (AY.20) that experienced a large deletion (872 nucleotides according to the reference Wuhan strain) that removes the 7a, 7b, and 8 ORFs. Deleted viruses coexist with wild-type (without deletion) AY.20 and AY.43 strains. The Uruguayan deletion resembles those identified in Delta strains from Poland and Japan but occurs in a different Delta clade. Besides providing proof of the circulation of this large deletion in America, we infer that the 872-nucleotide deletion arose by the consecutive occurrence of a 6-nucleotide deletion, characteristic of Delta strains, and an 866-nucleotide deletion that arose independently in the AY.20 Uruguayan lineage. The largest deletion occurs adjacent to transcription regulatory sequences needed to synthesize the nested set of subgenomic mRNAs that serve as templates for transcription. Our findings support the role of transcription sequences as a hotspot for copy-choice recombination and highlight the remarkable dynamics of SARS-CoV-2 genomes.

  • Article type: Journal Article
    Meiosis is an essential component of the sexual life cycle in eukaryotes. The independent assortment of chromosomes in meiosis increases genetic diversity at the level of whole chromosomes, and meiotic recombination increases genetic diversity within chromosomes. The resulting variability fuels evolution. Interestingly, global mapping of recombination in diverse taxa revealed dramatic changes in its frequency distribution between closely related species, subspecies, and even isolated populations of the same species. New insight into mechanisms for these evolutionarily rapid changes has come from analyses of environmentally induced plasticity of recombination in fission yeast. Many different DNA sites, and, where identified, their binding/activator proteins, control the positioning of recombination at hotspots. Each different class of hotspots functions as an independently controlled rheostat that modulates rates of recombination over a broad dynamic range in response to changing conditions. Together, this independent modulation can rapidly and dramatically alter the global frequency distribution of recombination. This process likely contributes substantially to (i.e., can largely explain) evolutionarily rapid, Prdm9-independent changes in the recombination landscape. Moreover, the precise control mechanisms allow cells to dynamically favor or disfavor newly arising combinations of linked alleles in response to changing extracellular and intracellular conditions, which has striking implications for the impacts of meiotic recombination on evolution.

  • Article type: Journal Article
    It has long been known (circa 1917) that environmental conditions, as well as speciation, can dramatically affect the frequency distribution of Spo11/Rec12-dependent meiotic recombination. Here, by analyzing DNA sequence-dependent meiotic recombination hotspots in the fission yeast Schizosaccharomyces pombe, we reveal a molecular basis for these phenomena. The impacts of changing environmental conditions (temperature, nutrients, and osmolarity) on local rates of recombination are mediated directly by DNA site-dependent hotspots (M26, CCAAT, and Oligo-C). This control is exerted through environmental condition-responsive signal transduction networks (involving Atf1, Pcr1, Php2, Php3, Php5, and Rst2). Strikingly, individual hotspots modulate rates of recombination over a very broad dynamic range in response to changing conditions. They can range from being quiescent to being highly proficient at promoting activity of the basal recombination machinery (Spo11/Rec12 complex). Moreover, each different class of hotspot functions as an independently controlled rheostat; a condition that increases the activity of one class can decrease the activity of another class. Together, the independent modulation of recombination rates by each different class of DNA site-dependent hotspots (of which there are many) provides a molecular mechanism for highly dynamic, large-scale changes in the global frequency distribution of meiotic recombination. Because hotspot-activating DNA sites discovered in fission yeast are conserved functionally in other species, this process can also explain the previously enigmatic, Prdm9-independent, evolutionarily rapid changes in hotspot usage between closely related species, subspecies, and isolated populations of the same species.

  • Article type: Journal Article
    A salmon louse (Lepeophtheirus salmonis salmonis) genetic linkage map was constructed to serve as a genomic resource for future investigations into the biology of this important marine parasitic copepod species, and to provide insights into the inheritance patterns of genetic markers in this species. SNP genotyping of 8 families confirmed the presence of 15 linkage groups based upon the assignment of 93,773 markers. Progeny sample size weight adjusted map sizes in males (with the exception of SL12 and SL15) ranged in size from 96.50 cM (SL11) to 134.61 cM (SL06), and total combined map steps or bins ranged from 143 (SL09) to 203 (SL13). The SL12 male map was the smallest linkage group with a weight-averaged size of 3.05 cM with 6 recombination bins. Male:female specific recombination rate differences are 10.49:1 and represent one of the largest reported sex-specific differences for any animal species. Recombination ratio differences (M:F) ranged from 1:1 (SL12) to 29:1 (SL15). The number of markers exhibiting normal Mendelian segregation within the sex linkage group SL15 was extremely low (N = 80) in comparison to other linkage groups genotyped [range: 1459 (SL12)-10206 markers (SL05)]. Re-evaluation of Mendelian inheritance patterns of markers unassigned to any mapping parent according to hemizygous segregation patterns (models presented) identified matches for many of these markers to hemizygous patterns. The greatest proportion of these markers were assigned to SL15 (N increased to 574). Inclusion of the hemizygous markers revised SL15 sex-specific recombination rate differences to 28:1. Recombination hot- and coldspots were identified across all linkage groups, with all linkage groups possessing multiple peaks. Nine of 13 linkage groups evaluated possessed adjacent domains with hot-coldspot transitional zones. The most common pattern was for one end of the linkage group to show elevated recombination in addition to internal regions. For SL01 and SL06, however, a terminal region with high recombination was not evident while a central domain possessing extremely high recombination levels was present. High levels of recombination were weakly coupled to higher levels of SNP variation within domains, but this association was very strong for the central domains of SL01 and SL06. From the pooled paternal half-sib lots (several virgin females placed with one male), only one or two surviving family lots were obtained. Surviving families possessed parents where both the male and female possessed either inherently low or high recombination rates. This study provides insight into the organization of the sea louse genome, and describes large differences in recombination rate that exist among individuals of the same sex, and between the sexes. These differences in recombination rate may be coupled to the capabilities of this species to adapt to environmental and pharmaceutical treatments, given that family survivorship appears to be enhanced when parents have similar recombination levels.

  • Article type: Journal Article
    The programmed formation of hundreds of DNA double-strand breaks (DSBs) is essential for proper meiosis and fertility. In mice and humans, the location of these breaks is determined by the meiosis-specific protein PRDM9, through the DNA-binding specificity of its zinc-finger domain. PRDM9 also has methyltransferase activity. Here, we show that this activity is required for H3K4me3 and H3K36me3 deposition and for DSB formation at PRDM9-binding sites. By analyzing mice that express two PRDM9 variants with distinct DNA-binding specificities, we show that each variant generates its own set of H3K4me3 marks independently from the other variant. Altogether, we reveal several basic principles of PRDM9-dependent DSB site determination, in which an excess of sites are designated through PRDM9 binding and subsequent histone methylation, from which a subset is selected for DSB formation.

  • Article type: Journal Article
    Traditional plant breeding relies on meiotic recombination for mixing of parental alleles to create novel allele combinations. Detailed analysis of recombination patterns in model organisms shows that recombination is tightly regulated within the genome, but frequencies vary extensively along chromosomes. Although tomato is a model organism for fruit developmental studies, high-resolution recombination patterns are lacking for it. In this study, we developed a novel methodology that uses low-coverage resequencing to identify genome-wide recombination patterns and applied it to 60 tomato Recombinant Inbred Lines (RILs). Our methodology identifies polymorphic markers from the low-coverage resequencing population data and uses the same data to locate the recombination breakpoints in individuals with a variable sliding window. We identified 1,445 recombination sites comprising 112 recombination-prone regions enriched for AT-rich DNA motifs. Furthermore, the recombination-prone regions in tomato preferably occurred in gene promoters over intergenic regions, an observation consistent with Arabidopsis thaliana, Zea mays and Mimulus guttatus. Overall, our cost-effective method and findings enhance the understanding of meiotic recombination in tomato and suggest evolutionarily conserved recombination-associated genomic features.
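    The abstract describes locating breakpoints with a variable sliding window but gives no algorithmic detail. As a loose illustration of the general idea, the sketch below uses a simpler fixed-width window: noisy per-marker parental calls ("A" or "B") are smoothed by majority vote, and a breakpoint interval is reported wherever the smoothed call switches parent. The fixed window, the function names, and the toy data are assumptions made here and do not reproduce the authors' method.

```python
from collections import Counter


def smooth_genotypes(calls, window=5):
    """Majority-vote smoothing of per-marker parental calls.

    Each marker takes the most common call within a window centred on it;
    the window is truncated at the chromosome ends.
    """
    half = window // 2
    smoothed = []
    for i in range(len(calls)):
        lo, hi = max(0, i - half), min(len(calls), i + half + 1)
        smoothed.append(Counter(calls[lo:hi]).most_common(1)[0][0])
    return smoothed


def breakpoints(positions, calls, window=5):
    """Return (left, right) marker positions flanking each parental switch."""
    smoothed = smooth_genotypes(calls, window)
    return [
        (positions[i - 1], positions[i])
        for i in range(1, len(smoothed))
        if smoothed[i] != smoothed[i - 1]
    ]


# Hypothetical toy chromosome: 12 markers, two isolated miscalls (index 2 and 9).
positions = list(range(100, 1300, 100))
calls = list("AABAAABBBABB")
found = breakpoints(positions, calls)
```

    With these toy calls the two isolated miscalls are smoothed away and a single breakpoint is placed between the markers at 600 and 700. Low-coverage data generally calls for a window that adapts to local marker density, which is presumably what the variable window in the study addresses.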