Enhancer

  • Article type: Journal Article
    Self-transcribing active regulatory region sequencing (STARR-seq) is a high-throughput sequencing method that can discover enhancers genome-wide and validate their activity at the same time. In this method, candidate sequences are inserted into plasmid vectors and electroporated into cells. Because each candidate acts as both enhancer and target gene, an active sequence enhances its own transcription. By sequencing the transcriptome and comparing the results with a control lacking inserts, the locations and activity of enhancers can be determined. In traditional enhancer discovery strategies, open chromatin regions and transcriptionally active regions were sequenced and predicted to be enhancers. However, the activity of these putative enhancers could only be validated one by one, with no high-throughput method available. STARR-seq overcomes this limitation, allowing enhancer discovery and activity validation to be carried out together in a high-throughput manner. Since its introduction, STARR-seq has been widely used to discover enhancers and validate enhancer activity in a number of organisms and cell types. In this review, we present the traditional enhancer prediction methods and the basic principles, development history, specific applications, and future prospects of STARR-seq, aiming to provide a reference for researchers conducting enhancer studies in related fields.
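    The comparison step described above can be sketched as a per-fragment enrichment score. This is a minimal illustration, not the pipeline of any particular STARR-seq study: it assumes read counts per candidate fragment are already available for the transcript library and for a control library, and the function name `starr_enrichment`, the counts-per-million normalisation, and the pseudocount are all illustrative choices.

```python
import math


def starr_enrichment(rna_counts, control_counts, pseudocount=1.0):
    """Return log2(transcript / control) per fragment.

    Counts are scaled to counts per million so that libraries of
    different sequencing depth are comparable. The pseudocount keeps
    fragments with zero transcript reads from producing log2(0).
    """
    rna_total = sum(rna_counts.values())
    control_total = sum(control_counts.values())
    scores = {}
    for fragment, n_control in control_counts.items():
        n_rna = rna_counts.get(fragment, 0)
        rna_cpm = (n_rna + pseudocount) / rna_total * 1e6
        control_cpm = (n_control + pseudocount) / control_total * 1e6
        scores[fragment] = math.log2(rna_cpm / control_cpm)
    return scores


# Hypothetical toy counts: fragment "a" is over-represented among
# transcripts, "b" is unchanged, "c" is depleted.
example_scores = starr_enrichment(
    {"a": 80, "b": 10, "c": 10},
    {"a": 10, "b": 10, "c": 80},
)
```

    A positive score suggests that a fragment is transcribed more than its abundance in the control would predict, which is the signal read as enhancer activity; real analyses would add replicates and a statistical test.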

  • Article type: Journal Article
    The myeloma overexpressed gene (MYEOV) has been proposed to be a proto-oncogene due to high RNA transcript levels found in multiple cancers, including myeloma, breast, lung, pancreas and esophageal cancer. The presence of an open reading frame (ORF) in humans and other primates suggests protein-coding potential. Yet, we still lack evidence of a functional MYEOV protein. It remains undetermined how MYEOV overexpression affects cancerous tissues. In this work, we show that MYEOV has likely originated and may still function as an enhancer, regulating CCND1 and LTO1. Firstly, MYEOV 3' enhancer activity was confirmed in humans using publicly available ATAC-STARR-seq data, performed on B-cell-derived GM12878 cells. We detected enhancer histone marks H3K4me1 and H3K27ac overlapping MYEOV in multiple healthy human tissues, which include B cells, liver and lung tissue. The analysis of 3D genome datasets revealed chromatin interactions between a MYEOV-3'-putative enhancer and the proto-oncogene CCND1. BLAST searches and multi-sequence alignment results showed that DNA sequence from this human enhancer element is conserved from the amphibians/amniotes divergence, with a 273 bp conserved region also found in all mammals, and even in chickens, where it is consistently located near the corresponding CCND1 orthologues. Furthermore, we observed conservation of an active enhancer state in the MYEOV orthologues of four non-human primates, dogs, rats, and mice. When studying this homologous region in mice, where the ORF of MYEOV is absent, we not only observed an enhancer chromatin state but also found interactions between the mouse enhancer homolog and Ccnd1 using 3D-genome interaction data. This is similar to the interaction observed in humans and, interestingly, coincides with CTCF binding sites in both species. Taken together, this suggests that MYEOV is a primate-specific gene with a de novo ORF that originated at an evolutionarily older enhancer region. This deeply conserved putative enhancer element could regulate CCND1 in both humans and mice, opening the possibility of studying MYEOV regulatory functions in cancer using non-primate animal models.

  • Article type: Journal Article
    Experimental access to cell types within the mammalian spinal cord is severely limited by the availability of genetic tools. To enable access to lower motor neurons (LMNs) and LMN subtypes, which function to integrate information from the brain and control movement through direct innervation of effector muscles, we generated single cell multiome datasets from mouse and macaque spinal cords and discovered putative enhancers for each neuronal population. We cloned these enhancers into adeno-associated viral vectors (AAVs) driving a reporter fluorophore and functionally screened them in mouse. The most promising candidate enhancers were then extensively characterized using imaging and molecular techniques and further tested in rat and macaque to show conservation of LMN labeling. Additionally, we combined enhancer elements into a single vector to achieve simultaneous labeling of upper motor neurons (UMNs) and LMNs. This unprecedented LMN toolkit will enable future investigations of cell type function across species and potential therapeutic interventions for human neurodegenerative diseases.

  • Article type: Journal Article
    Coronaviruses constitute a global threat to human and animal health. It is essential to investigate the long-distance RNA-RNA interactions that approximate remote regulatory elements in strategies, including genome circularization, discontinuous transcription, and transcriptional enhancers, aimed at the rapid replication of their large genomes, pathogenicity, and immune evasion. Based on the primary sequences and modeled RNA-RNA interactions of two experimentally defined coronaviral enhancers, we detected via an in silico primary and secondary structural analysis potential enhancers in various coronaviruses, from the phylogenetically ancient avian infectious bronchitis virus (IBV) to the recently emerged SARS-CoV-2. These potential enhancers possess a core duplex-forming region that could transition between closed and open states, as molecular switches directed by viral or host factors. The duplex open state would pair with remote sequences in the viral genome and modulate the expression of downstream crucial genes involved in viral replication and host immune evasion. Consistently, variations in the predicted IBV enhancer region or its distant targets coincide with cases of viral attenuation, possibly driven by decreased open reading frame (ORF)3a immune evasion protein expression. If validated experimentally, the annotated enhancer sequences could inform structural prediction tools and antiviral interventions.

  • Article type: Journal Article
    Mouse FOXA1 and GATA4 are prototypes of pioneer factors, initiating liver cell development by binding to the N1 nucleosome in the enhancer of the ALB1 gene. Using cryoelectron microscopy (cryo-EM), we determined the structures of the free N1 nucleosome and its complexes with FOXA1 and GATA4, both individually and in combination. We found that the DNA-binding domains of FOXA1 and GATA4 mainly recognize the linker DNA and an internal site in the nucleosome, respectively, whereas their intrinsically disordered regions interact with the acidic patch on histone H2A-H2B. FOXA1 efficiently enhances GATA4 binding by repositioning the N1 nucleosome. In vivo DNA editing and bioinformatics analyses suggest that the co-binding mode of FOXA1 and GATA4 plays important roles in regulating genes involved in liver cell functions. Our results reveal the mechanism whereby FOXA1 and GATA4 cooperatively bind to the nucleosome through nucleosome repositioning, opening chromatin by bending linker DNA and obstructing nucleosome packing.

  • Article type: Journal Article
    Children with severe asthma suffer from recurrent symptoms and impaired quality of life despite advanced treatment. Underlying causes of severe asthma are not completely understood, although genetic mechanisms are known to be important.
    The aim of this study was to identify gene regulatory enhancers in leukocytes, to describe the role of these enhancers in regulating genes related to severe and mild asthma in children, and to identify known asthma-related SNPs situated in proximity to enhancers.
    Gene enhancers were identified, and expression of enhancers and genes was measured using Cap Analysis Gene Expression (CAGE) data from peripheral blood leukocytes of children with severe asthma (n = 13), mild asthma (n = 15), and age-matched controls (n = 9).
    From a comprehensive set of 8,289 identified enhancers, we further defined a robust subset of 4,738 high-confidence and most highly expressed enhancers. Known single nucleotide polymorphisms (SNPs) related to asthma coincided with enhancers in general as well as with specific enhancer-gene interactions. Blocks of enhancer clusters were associated with genes including those of TGF-beta, PPAR and IL-11 signaling as well as genes related to vitamin A and D metabolism. A signature of 91 enhancers distinguished between children with severe asthma, children with mild asthma, and controls.
    Gene regulatory enhancers were identified in leukocytes with potential roles related to severe and mild asthma in children. Enhancers hosting known SNPs give the opportunity to formulate mechanistic hypotheses about the functions of these SNPs.

  • Article type: Journal Article
    Enhancers are crucial in gene expression regulation, dictating the specificity and timing of transcriptional activity, which highlights the importance of their identification for unravelling the intricacies of genetic regulation. Therefore, it is critical to identify enhancers and their strengths. Repeated sequences in the genome are repeats of the same or symmetrical fragments. There is a great deal of evidence that repetitive sequences contain enormous amounts of genetic information. Thus, we introduce the W2V-Repeated Index, designed to identify enhancer sequence fragments and evaluate their strength through the analysis of repeated K-mer sequences in enhancer regions. Utilizing the word2vector algorithm for numerical conversion and Manta Ray Foraging Optimization for feature selection, this method effectively captures the frequency and distribution of K-mer sequences. By concentrating on repeated K-mer sequences, it minimizes computational complexity and facilitates the analysis of larger K values. Experiments indicate that our method performs better than all other advanced methods on almost all indicators.
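    The idea of restricting the analysis to repeated K-mers can be illustrated with a short sketch. This is not the authors' W2V-Repeated Index: the word2vector embedding and the Manta Ray Foraging Optimization step are not shown, and the function name `repeated_kmers`, the use of overlapping windows, and the "more than once" threshold are assumptions made for illustration.

```python
from collections import Counter


def repeated_kmers(sequence, k):
    """Return the K-mers that occur more than once in a sequence.

    Overlapping windows are counted, so a sequence of length n
    contributes n - k + 1 K-mers. Only K-mers seen at least twice
    are kept, which shrinks the feature set as k grows.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n > 1}


# Hypothetical toy sequence: "ACGT" and "CGTA" each appear twice.
example = repeated_kmers("ACGTACGTAA", 4)
```

    Keeping only K-mers that repeat discards the many singletons that dominate at larger K, which is one plausible reading of why the abstract reports reduced computational complexity.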

  • Article type: Journal Article
    Pancreatic cancer is a malignancy arising from the endocrine or exocrine compartment of this organ. Tumors of exocrine origin comprise over 90% of all pancreatic cancers diagnosed. Of these, pancreatic ductal adenocarcinoma (PDAC) is the most common histological subtype. The five-year survival rate for PDAC ranged between 5 and 9% for over four decades, and only recently saw a modest increase to ∼12-13%, making this a severe and lethal disease. Like other cancers, PDAC initiation stems from genetic changes. However, therapeutic targeting of PDAC genetic drivers has remained relatively unsuccessful; thus, the focus in recent years has expanded to the non-genetic factors underlying the disease pathogenesis. Specifically, it has been proposed that dynamic changes in the epigenetic landscape promote tumor growth and metastasis. Emphasis has been given to the re-organization of enhancers, essential regulatory elements controlling oncogenic gene expression, commonly marked by histone 3 lysine 4 monomethylation (H3K4me1). H3K4me1 is typically deposited by histone lysine methyltransferases (KMTs). While KMTs are well characterized as oncogenes in other cancer types, recent work has expanded their role to that of tumor suppressors in pancreatic cancer. Here, we review the role and translational significance of KMTs for PDAC development and therapeutics.

  • Article type: Journal Article
    Histone modifications, known as histone marks, are pivotal in regulating gene expression within cells. The vast array of potential combinations of histone marks presents a considerable challenge in decoding the regulatory mechanisms solely through biological experimental approaches. To overcome this challenge, we have developed a method called CatLearning. It utilizes a modified convolutional neural network architecture with a specialized adaptation Residual Network to quantitatively interpret histone marks and predict gene expression. This architecture integrates long-range histone information up to 500Kb and learns chromatin interaction features without 3D information. By using only one histone mark, CatLearning achieves a high level of accuracy. Furthermore, CatLearning predicts gene expression by simulating changes in histone modifications at enhancers and throughout the genome. These findings help comprehend the architecture of histone marks and develop diagnostic and therapeutic targets for diseases with epigenetic changes.

  • Article type: Journal Article
    Long or Post COVID-19 is a condition in which a collection of symptoms persists after recovery from COVID-19. Host genetic factors play a crucial role in developing Long COVID-19, and GWAS studies have identified several SNPs/genes in various ethnic populations. In the African-American population, two SNPs, rs10999901 (C>T, p = 3.6E-08, OR = 1.39, MAF = 0.27, GRCh38, chr10:71584799 bp) and rs1868001 (G>A, p = 6.7E-09, OR = 1.40, MAF = 0.46, GRCh38, chr10:71587815 bp), and in the Hispanic population, rs3759084 (A>C, p = 9.7E-09, OR = 1.56, MAF = 0.17, chr12: 81,110,156 bp), are strongly associated with Long COVID-19. All three SNPs reside in noncoding regions, implying a regulatory function in the genome. In silico dissection suggests that rs10999901 and rs1868001 physically interact with the CDH23 and C10orf105 genes. Both SNPs act as distant enhancers and bind several transcription factors (TFs). Further, the rs10999901 SNP is a CpG that is methylated in CD4+ T cells and monocytes and loses its methylation through the C>T transition. rs3759084 is located in the promoter (-687 bp) of MYF5, acts as a distant enhancer, and physically interacts with PTPRQ. These results offer plausible explanations for the associations and provide a basis for experiments to dissect the development of Long COVID-19 symptoms.