segmental duplication

  • Article type: Journal Article
    The segmentally duplicated Pregnancy-specific glycoprotein (PSG) locus on chromosome 19q13 may be one of the most rapidly evolving in the human genome. It comprises ten coding genes (PSG1-9, 11) and one predominantly non-coding gene (PSG10) that are expressed in the placenta and gut, in addition to several poorly characterized long non-coding RNAs. We report that long non-coding RNA PSG8-AS1 has an oligodendrocyte-specific expression pattern and is co-expressed with genes encoding key myelin constituents. PSG8-AS1 exhibits two peaks of expression during human brain development coinciding with the most active periods of oligodendrogenesis and myelination. PSG8-AS1 orthologs were found in the genomes of several primates but significant expression was found only in the human, suggesting a recent evolutionary origin of its proposed role in myelination. Additionally, because co-deletion of chromosomes 1p/19q is a genomic marker of oligodendroglioma, expression of PSG8-AS1 was examined in these tumors. PSG8-AS1 may be a promising diagnostic biomarker for glioma, with prognostic value in oligodendroglioma.

  • Article type: Journal Article
    Recent years have seen a dramatic increase in the number of canine genome assemblies available. Duplications are an important source of evolutionary novelty and are also prone to misassembly. We explored the duplication content of nine canine genome assemblies using both genome self-alignment and read-depth approaches. We find that 8.58% of the genome is duplicated in the canFam4 assembly, derived from the German Shepherd Dog Mischka, including 90.15% of unplaced contigs. Highlighting the continued difficulty in properly assembling duplications, less than half of read-depth and assembly alignment duplications overlap, but the mCanLor1.2 Greenland wolf assembly shows greater concordance. Further study shows the presence of multiple segments that have alignments to four or more duplicate copies. These high-recurrence duplications correspond to gene retrocopies. We identified 3,892 candidate retrocopies from 1,316 parental genes in the canFam4 assembly and find that ∼8.82% of duplicated base pairs involve a retrocopy, confirming this mechanism as a major driver of gene duplication in canines. Similar patterns are found across eight other recent canine genome assemblies, with metrics supporting a greater quality of the PacBio HiFi mCanLor1.2 assembly. Comparison between the wolf and other canine assemblies found that 92% of retrocopy insertions are shared between assemblies. By calculating the number of generations since genome divergence, we estimate that new retrocopy insertions appear, on average, in 1 out of 3,514 births. Our analyses illustrate the impact of retrogene formation on canine genomes and highlight the variable representation of duplicated sequences among recently completed canine assemblies.

  • Article type: Journal Article
    The duplication-triplication/inverted-duplication (DUP-TRP/INV-DUP) structure is a complex genomic rearrangement (CGR). Although it has been identified as an important pathogenic DNA mutation signature in genomic disorders and cancer genomes, its architecture remains unresolved. Here, we studied the genomic architecture of DUP-TRP/INV-DUP by investigating the DNA of 24 patients identified by array comparative genomic hybridization (aCGH), in whom we found evidence for all 4 predicted structural variant (SV) haplotypes. Using a combination of short-read genome sequencing (GS), long-read GS, optical genome mapping, and single-cell DNA template strand sequencing (strand-seq), the haplotype structure was resolved in 18 samples. The point of template switching in 4 samples was shown to be a segment of ∼2.2-5.5 kb of 100% nucleotide similarity within inverted repeat pairs. These data provide experimental evidence that inverted low-copy repeats act as recombinant substrates. This type of CGR can result in multiple conformers generating diverse SV haplotypes in susceptible dosage-sensitive loci.

  • Article type: Journal Article
    Leucine-rich repeat receptor-like kinases (LRR-RLKs) represent the largest subgroup of receptor-like kinases (RLKs) in plants. While some LRR-RLK members play a role in regulating various plant growth processes related to morphogenesis, disease resistance, and stress response, the functions of most LRR-RLK genes remain unclear. In this study, we identified 397 LRR-RLK genes from the genome of Camellia sinensis and categorized them into 16 subfamilies. Approximately 62% of CsLRR-RLK genes are situated in regions resulting from segmental duplications, suggesting that the expansion of CsLRR-RLK genes is due to segmental duplications. Analysis of gene expression patterns revealed differential expression of CsLRR-RLK genes across different tissues and in response to stress. Furthermore, we demonstrated that CssEMS1 localizes to the cell membrane and can complement the Arabidopsis ems1 mutant. This study is the first in-depth evolutionary examination of LRR-RLKs in tea and provides a basis for future investigations into their functionality.
    The online version contains supplementary material available at 10.1007/s12298-024-01458-1.

  • Article type: Journal Article
    Long-read sequencing data, particularly those derived from the Oxford Nanopore sequencing platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using Nanopore long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale long-read assembly using Nanopore long-read data.

  • Article type: Journal Article
    Collembola is a highly diverse and abundant group of soil arthropods with chromosome numbers ranging from 5 to 11. Previous karyotype studies indicated that the Tomoceridae family possesses an exceptionally long chromosome. To better understand chromosome size evolution in Collembola, we obtained a chromosome-level genome of Yoshiicerus persimilis with a size of 334.44 Mb and BUSCO completeness of 97.0% (n = 1013). Both genomes of Y. persimilis and Tomocerus qinae (recently published) have an exceptionally large chromosome (ElChr greater than 100 Mb), accounting for nearly one-third of the genome. Comparative genomic analyses suggest that chromosomal elongation occurred independently in the two species approximately 10 million years ago, rather than in the ancestor of the Tomoceridae family. The ElChr elongation was caused by large tandem and segmental duplications, as well as transposon proliferation, with genes in these regions experiencing weaker purifying selection (higher dN/dS) than conserved regions. Moreover, inter-genomic synteny analyses indicated that chromosomal fission/fusion events played a crucial role in the evolution of chromosome numbers (ranging from 5 to 7) within Entomobryomorpha. This study provides a valuable resource for investigating the chromosome evolution of Collembola.

  • Article type: Journal Article
    BACKGROUND: The genomic region that lies between the telomere and chromosome body, termed the subtelomere, is heterochromatic, repeat-rich, and frequently undergoes rearrangement. Within this region, large-scale structural changes enable gene diversification, and, as such, large multicopy gene families are often found at the subtelomere. In some parasites, genes associated with proliferation, invasion, and survival are often found in these regions, where they benefit from the subtelomere's highly plastic, rapidly changing nature. The increasing availability of complete (or near complete) parasite genomes provides an opportunity to investigate these typically poorly defined and overlooked genomic regions and potentially reveal relevant gene families necessary for the parasite's lifestyle.
    RESULTS: Using the latest chromosome-scale genome assembly and hallmark repeat richness observed at chromosome termini, we have identified and characterised the subtelomeres of Schistosoma mansoni, a metazoan parasitic flatworm that infects over 250 million people worldwide. Approximately 12% of the S. mansoni genome is classified as subtelomeric, and, in line with other organisms, we find these regions to be gene-poor but rich in transposable elements. We find that S. mansoni subtelomeres have undergone extensive interchromosomal recombination and that these sites disproportionately contribute to the 2.3% of the genome derived from segmental duplications. This recombination has led to the expansion of subtelomeric gene clusters containing 103 genes, including the immunomodulatory annexins and other gene families with unknown roles. The largest of these is a 49-copy plexin domain-containing protein cluster, exclusively expressed in the tegument (the tissue located at the host-parasite physical interface) of intramolluscan life stages.
    CONCLUSIONS: We propose that subtelomeric regions act as a genomic playground for trial-and-error of gene duplication and subsequent divergence. Owing to the importance of subtelomeric genes in other parasites, gene families implicated in this subtelomeric expansion within S. mansoni warrant further characterisation for a potential role in parasitism.

  • Article type: Journal Article
    Optical genome mapping (OGM) is known as an all-in-one technology for chromosomal aberration detection. However, there are also aberrations beyond the detection range of OGM. This study aimed to report the aberrations missed by OGM and analyze the contributing factors. OGM was performed by taking both GRCh37 and GRCh38 as reference genomes. The OGM results were analyzed in a blinded fashion and compared to standard assays. Quality control (QC) metrics, sample types, reference genome, effective coverage, and classes and locations of aberrations were then analyzed. In total, 154 clinically reported variations from 123 samples were investigated. OGM failed to detect 10 (6.5%, 10/154) aberrations with the GRCh37 assembly, including five copy number variations (CNVs), two submicroscopic balanced translocations, two pericentric inversions, and one isochromosome (mosaicism). All the samples passed pre-analytical and analytical QC. With the GRCh38 assembly, the false-negative rate of OGM fell to 4.5% (7/154). The breakpoints of the CNVs, balanced translocations, and inversions undetected by OGM were located in segmental duplication (SD) regions or regions with no DLE-1 label. In conclusion, besides variations with centromeric breakpoints, structural variations (SVs) with breakpoints located in large repetitive sequences may also be missed by OGM. GRCh38 is recommended as the reference genome when OGM is performed. Our results highlight the necessity of fully understanding the detection range and limitations of OGM in clinical practice.
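    The false-negative rates quoted in this abstract follow directly from the reported counts. The short Python sketch below only re-derives that arithmetic; the variable and function names are illustrative and do not come from the study, and the per-class breakdown is the one listed for GRCh37.

```python
# Counts as reported in the abstract; all names here are illustrative only.
TOTAL_VARIANTS = 154
MISSED_BY_REFERENCE = {"GRCh37": 10, "GRCh38": 7}

# Classes of aberration missed with the GRCh37 assembly, per the abstract.
MISSED_GRCH37_BY_CLASS = {
    "copy number variation": 5,
    "submicroscopic balanced translocation": 2,
    "pericentric inversion": 2,
    "isochromosome (mosaicism)": 1,
}


def false_negative_rate(missed: int, total: int) -> float:
    """Return missed/total as a percentage, rounded to one decimal place."""
    return round(100 * missed / total, 1)


rates = {
    ref: false_negative_rate(missed, TOTAL_VARIANTS)
    for ref, missed in MISSED_BY_REFERENCE.items()
}

for ref, rate in rates.items():
    print(f"{ref}: {rate}% of {TOTAL_VARIANTS} variants missed")
```

    The per-class counts sum to the 10 aberrations missed with GRCh37, which is a quick consistency check on the figures as quoted.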

  • Article type: Journal Article
    The WUSCHEL-related homeobox (WOX) gene family has been implicated in promoting the transition from vegetative organs to embryos and in maintaining plant embryonic stem cell identity. Using genome-wide analysis, we identified 17 candidate WOX genes in ramie (Boehmeria nivea). The genes (BnWOX) showed highly conserved homeodomain regions typical of WOX. Based on phylogenetic analysis, they were classified into three distinct groups: modern, intermediate, and ancient clades. The genes displayed 65% and 35% collinearities with their Arabidopsis thaliana and Oryza sativa orthologs, respectively, and exhibited similar motifs, suggesting similar functions. Furthermore, four segmental duplications (BnWOX10/14, BnWOX13A/13B, BnWOX9A/9B, and BnWOX6A/Maker00021031) and a tandem-duplicated pair (BnWOX5/7) were identified among the putative ramie WOX genes, suggesting that whole-genome duplication (WGD) played a role in WOX gene expansion. Expression profiling analysis of the genes in the bud, leaf, stem, and root of the stem cuttings revealed higher expression levels of BnWOX10 and BnWOX14 in the stem and root and lower levels in the leaf, consistent with the qRT-PCR analysis, suggesting their direct roles in ramie root formation. Analysis of the rooting characteristics and expression in the stem cuttings of sixty-seven different ramie genetic resources showed a possible involvement of BnWOX14 in the adventitious rooting of ramie. Thus, this study provides valuable information on ramie WOX genes and lays the foundation for further research.

  • Article type: Journal Article
    The pentatricopeptide repeat (PPR) gene family is one of the largest gene families in land plants. However, current knowledge about the evolution of the PPR gene family remains limited. In this study, we performed a comparative genomic analysis of the PPR gene family in O. sativa and its wild progenitor, O. rufipogon, and outlined a comprehensive landscape of gene duplications. Our findings suggest that the majority of PPR genes originated from dispersed duplications. Although segmental duplications have expanded only approximately 11.30% and 13.57% of the PPR gene families in the O. sativa and O. rufipogon genomes, we obtained evidence that segmental duplication promotes the structural diversity of PPR genes through incomplete gene duplications. In the O. sativa and O. rufipogon genomes, 10 (~33.33%) and 22 pairs of gene duplications (~45.83%) had non-PPR paralogous genes through incomplete gene duplication. Segmental duplications leading to incomplete gene duplications might result in the acquisition of domains, thus promoting functional innovation and structural diversification of PPR genes. This study offers a unique perspective on the evolution of PPR gene structures and underscores the potential role of segmental duplications in PPR gene structural diversity.