Regulatory Elements, Transcriptional

  • Article type: Journal Article
    BACKGROUND: Cis-regulatory elements (CREs) are crucial for regulating gene expression, and G-quadruplexes (G4s), as prototypal non-canonical DNA structures, may play a role in this regulation. However, the relationship between G4s and CREs, especially with non-promoter-like functional elements, requires further systematic investigation. We aimed to investigate the associations between G4s and human cCREs (candidate CREs) inferred from the Encyclopedia of DNA Elements (ENCODE) data.
    RESULTS: We found that G4s are prominently enriched in most types of cCREs, especially those with promoter-like signatures (PLS). The co-occurrence of CTCF signals with H3K4me3 or H3K27ac signals strengthens the association between cCREs and G4s. Genetic variants in G4s, particularly within their G-runs, exhibit higher regulatory potential and deleterious effects compared to cCREs. The G-runs within G4s near transcriptional start sites (TSSs) are more evolutionarily constrained compared to G-runs in cCREs, while those far from the TSS are relatively less conserved. The presence of G4s is often linked to a more favorable local chromatin environment for the activation and execution of regulatory function of cCREs, potentially attributable to the formation of G4 secondary structures. Finally, we discovered that G4-associated cCREs exhibit widespread activation in a variety of cancers.
    CONCLUSIONS: Our study suggests that G4s are integral components of human cis-regulatory elements, extending beyond their potential role in promoters. The G4 primary sequences are associated with the localization of CREs, while the G4 structures are linked to the activation of these elements. Therefore, we propose defining G4s as pivotal regulatory elements in the human genome.
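The enrichment of G4s in a class of cCREs, as described in the abstract above, can be illustrated with an odds ratio from a 2x2 overlap table. This is a minimal sketch only: the abstract does not state which statistic the authors used, and the function name, the Wald confidence interval, and the counts in the usage note are illustrative assumptions, not values from the study.

```python
import math


def g4_enrichment_odds_ratio(overlap_in, total_in, overlap_bg, total_bg):
    """Odds ratio and 95% Wald CI for G4 overlap in a cCRE class vs. background.

    All four arguments are region counts. This is a hypothetical helper,
    not the statistic reported in the abstract.
    """
    a = overlap_in                # cCREs of the class that overlap a G4
    b = total_in - overlap_in     # cCREs of the class without a G4
    c = overlap_bg                # background regions that overlap a G4
    d = total_bg - overlap_bg     # background regions without a G4
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction avoids division by zero.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(odds ratio)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper
```

For example, `g4_enrichment_odds_ratio(30, 100, 10, 100)` uses made-up counts in which 30 of 100 promoter-like cCREs and 10 of 100 background regions overlap a G4; an odds ratio whose interval lies above 1 would indicate enrichment.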






  • Article type: Journal Article
    Cis-regulatory elements (CREs) are pivotal in orchestrating gene expression throughout diverse biological systems. Accurate identification and in-depth characterization of functional CREs are crucial for decoding gene regulation networks during cellular processes. In this study, we develop Kethoxal-Assisted Single-stranded DNA Assay for Transposase-Accessible Chromatin with Sequencing (KAS-ATAC-seq) to quantitatively analyze the transcriptional activity of CREs. A main advantage of KAS-ATAC-seq lies in its precise measurement of ssDNA levels within both proximal and distal ATAC-seq peaks, enabling the identification of transcriptional regulatory sequences. This feature makes it particularly well suited to defining Single-Stranded Transcribing Enhancers (SSTEs). SSTEs are highly enriched with nascent RNAs and with binding sites for specific transcription factors (TFs) that define cellular identity. Moreover, KAS-ATAC-seq provides a detailed characterization of various SSTE subtypes and their functional implications. Our analysis of CREs during mouse neural differentiation demonstrates that KAS-ATAC-seq can effectively identify immediate-early activated CREs in response to retinoic acid (RA) treatment. Our findings indicate that KAS-ATAC-seq provides more precise annotation of functional CREs in transcription. Future applications of KAS-ATAC-seq would help elucidate the intricate dynamics of gene regulation in diverse biological processes.






  • Article type: Journal Article
    Transposable elements (TEs) and other repetitive regions have been shown to contain gene regulatory elements, including transcription factor binding sites. However, regulatory elements harbored by repeats have proven difficult to characterize using short-read sequencing assays such as ChIP-seq or ATAC-seq. Most regulatory genomics analysis pipelines discard "multimapped" reads that align equally well to multiple genomic locations. Because multimapped reads arise predominantly from repeats, current analysis pipelines fail to detect a substantial portion of regulatory events that occur in repetitive regions. To address this shortcoming, we developed Allo, a new approach to allocate multimapped reads in an efficient, accurate, and user-friendly manner. Allo combines probabilistic mapping of multimapped reads with a convolutional neural network that recognizes the read distribution features of potential peaks, offering enhanced accuracy in multimapping read assignment. Allo also provides read-level output in the form of a corrected alignment file, making it compatible with existing regulatory genomics analysis pipelines and downstream peak-finders. In a demonstration application on CTCF ChIP-seq data, we show that Allo results in the discovery of thousands of new CTCF peaks. Many of these peaks contain the expected cognate motif and/or serve as TAD boundaries. We additionally apply Allo to a diverse collection of ENCODE ChIP-seq data sets, resulting in multiple previously unidentified interactions between transcription factors and repetitive element families. Finally, we show that Allo may be particularly beneficial in identifying ChIP-seq peaks at centromeres, near segmentally duplicated genes, and in younger TEs, enabling new regulatory analyses in these regions.
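The general idea of probabilistic allocation of multimapped reads can be sketched as follows. This is not Allo's implementation: it omits the convolutional neural network entirely and simply weights each candidate location by the number of uniquely mapped reads nearby. The function names, the 500 bp window, and the pseudocount are assumptions made for illustration.

```python
import random
from bisect import bisect_left, bisect_right


def local_unique_count(sorted_positions, pos, window=500):
    """Count uniquely mapped reads within +/- window bp of pos (inclusive)."""
    lo = bisect_left(sorted_positions, pos - window)
    hi = bisect_right(sorted_positions, pos + window)
    return hi - lo


def allocation_probs(unique_by_chrom, candidates, window=500, pseudocount=1.0):
    """Probability of each candidate (chrom, pos) for one multimapped read.

    unique_by_chrom maps chromosome name -> sorted list of read positions.
    The pseudocount keeps candidates with no nearby unique reads possible.
    """
    weights = [
        local_unique_count(unique_by_chrom.get(chrom, []), pos, window) + pseudocount
        for chrom, pos in candidates
    ]
    total = sum(weights)
    return [w / total for w in weights]


def allocate_read(unique_by_chrom, candidates, rng, window=500):
    """Sample a single location for a multimapped read from its candidates."""
    probs = allocation_probs(unique_by_chrom, candidates, window)
    return rng.choices(candidates, weights=probs, k=1)[0]
```

A call such as `allocate_read(unique, [("chr1", 130), ("chr2", 9000)], random.Random(0), window=100)` returns one of the candidate locations, favoring the one with more uniquely mapped support. Sampling a location, rather than always taking the most likely one, keeps the output read-level, in the same spirit as the corrected alignment file described above.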






  • Article type: Journal Article
    Most genetic variants associated with psychiatric disorders are located in noncoding regions of the genome. To investigate their functional implications, we integrate epigenetic data from the PsychENCODE Consortium and other published sources to construct a comprehensive atlas of candidate brain cis-regulatory elements. Using deep learning, we model these elements' sequence syntax and predict how binding sites for lineage-specific transcription factors contribute to cell type-specific gene regulation in various types of glia and neurons. The elements' evolutionary history suggests that new regulatory information in the brain emerges primarily via smaller sequence mutations within conserved mammalian elements rather than entirely new human- or primate-specific sequences. However, primate-specific candidate elements, particularly those active during fetal brain development and in excitatory neurons and astrocytes, are implicated in the heritability of brain-related human traits. Additionally, we introduce PsychSCREEN, a web-based platform offering interactive visualization of PsychENCODE-generated genetic and epigenetic data from diverse brain cell types in individuals with psychiatric disorders and healthy controls.






  • Article type: Journal Article
    ChIP-Atlas presents a suite of data-mining tools for analyzing epigenomic landscapes, powered by the comprehensive integration of over 376 000 public ChIP-seq, ATAC-seq, DNase-seq and Bisulfite-seq experiments from six representative model organisms. To unravel the intricacies of chromatin architecture that mediates the regulome-initiated generation of transcriptional and phenotypic diversity within cells, we report ChIP-Atlas 3.0 that enhances clarity by incorporating additional tracks for genomic and epigenomic features within a newly consolidated 'annotation track' section. The tracks include chromosomal conformation (Hi-C and eQTL datasets), transcriptional regulatory elements (ChromHMM and FANTOM5 enhancers), and genomic variants associated with diseases and phenotypes (GWAS SNPs and ClinVar variants). These annotation tracks are easily accessible alongside other experimental tracks, facilitating better elucidation of chromatin architecture underlying the diversification of transcriptional and phenotypic traits. Furthermore, 'Diff Analysis,' a new online tool, compares the query epigenome data to identify differentially bound, accessible, and methylated regions using ChIP-seq, ATAC-seq and DNase-seq, and Bisulfite-seq datasets, respectively. The integration of annotation tracks and the Diff Analysis tool, coupled with continuous data expansion, renders ChIP-Atlas 3.0 a robust resource for mining the landscape of transcriptional regulatory mechanisms, thereby offering valuable perspectives, particularly for genetic disease research and drug discovery.






  • Article type: Journal Article
    The inability to scalably and precisely measure the activity of developmental cis-regulatory elements (CREs) in multicellular systems is a bottleneck in genomics. Here we develop a dual RNA cassette that decouples the detection and quantification tasks inherent to multiplex single-cell reporter assays. The resulting measurement of reporter expression is accurate over multiple orders of magnitude, with a precision approaching the limit set by Poisson counting noise. Together with RNA barcode stabilization via circularization, these scalable single-cell quantitative expression reporters provide high-contrast readouts, analogous to classic in situ assays but entirely from sequencing. Screening >200 regions of accessible chromatin in a multicellular in vitro model of early mammalian development, we identify 13 (8 previously uncharacterized) autonomous and cell-type-specific developmental CREs. We further demonstrate that chimeric CRE pairs generate cognate two-cell-type activity profiles and assess gain- and loss-of-function multicellular expression phenotypes from CRE variants with perturbed transcription factor binding sites. Single-cell quantitative expression reporters can be applied in developmental and multicellular systems to quantitatively characterize native, perturbed and synthetic CREs at scale, with high sensitivity and at single-cell resolution.
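The "limit set by Poisson counting noise" mentioned above has a simple form: if a reporter is quantified by counting N molecules, the best achievable coefficient of variation is 1/sqrt(N). The sketch below compares replicate counts against that floor. The function names and the example counts are illustrative assumptions, not the authors' analysis.

```python
import math
import statistics


def poisson_cv_floor(mean_count):
    """Smallest CV achievable when counting mean_count molecules on average."""
    return 1.0 / math.sqrt(mean_count)


def observed_cv(counts):
    """Sample coefficient of variation across replicate counts."""
    return statistics.stdev(counts) / statistics.fmean(counts)


def excess_noise_ratio(counts):
    """Observed CV divided by the Poisson floor at the same mean.

    A value near 1 means precision is close to the counting-noise limit;
    larger values indicate additional technical or biological noise.
    """
    return observed_cv(counts) / poisson_cv_floor(statistics.fmean(counts))
```

With hypothetical replicate counts `[90, 100, 110]`, the mean is 100 and the sample standard deviation is 10, so the observed CV of 0.1 equals the Poisson floor and the ratio is 1.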






  • Article type: Journal Article
    Differential gene expression in response to perturbations is mediated at least in part by changes in binding of transcription factors (TFs) and other proteins at specific genomic regions. Association of these cis-regulatory elements (CREs) with their target genes is a challenging task that is essential to address many biological and mechanistic questions. Many current approaches rely on chromatin conformation capture techniques or single-cell correlational methods to establish CRE-to-gene associations. These methods can be effective but have limitations, including resolution, gaps in detectable association distances, and cost. As an alternative, we have developed DegCre, a nonparametric method that evaluates correlations between measurements of perturbation-induced differential gene expression and differential regulatory signal at CREs to score possible CRE-to-gene associations. It has several unique features, including the ability to use any type of CRE activity measurement, yield probabilistic scores for CRE-to-gene pairs, and assess CRE-to-gene pairings across a wide range of sequence distances. We apply DegCre to six data sets, each using different perturbations and containing a variety of regulatory signal measurements, including chromatin openness, histone modifications, and TF occupancy. To test their efficacy, we compare DegCre associations to Hi-C loop calls and CRISPR-validated CRE-to-gene associations, establishing good performance by DegCre that is comparable or superior to competing methods. DegCre is a novel approach to the association of CREs to genes from a perturbation-differential perspective, with strengths that are complementary to existing approaches and allow for new insights into gene regulation.
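The underlying idea of relating differential CRE signal to differential gene expression across distance ranges can be illustrated with a rank correlation computed per distance bin. This is a loose sketch and not the DegCre algorithm, which yields probabilistic scores for individual CRE-to-gene pairs. The function names, the input tuple layout, and the choice of Spearman correlation are all assumptions.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation; None if either input has no variation."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return None
    return cov / math.sqrt(vx * vy)


def binned_spearman(pairs, bin_edges, min_pairs=3):
    """Correlate CRE and gene differential scores within each distance bin.

    pairs: iterable of (distance_bp, cre_score, gene_score), one per
    candidate CRE-gene pair. Bins are half-open: [lo, hi).
    """
    pairs = list(pairs)
    result = {}
    for lo, hi in zip(bin_edges, bin_edges[1:]):
        sel = [(c, g) for d, c, g in pairs if lo <= d < hi]
        if len(sel) < min_pairs:
            result[(lo, hi)] = None  # too few pairs to estimate
            continue
        cre_scores, gene_scores = zip(*sel)
        result[(lo, hi)] = spearman(cre_scores, gene_scores)
    return result
```

Calling `binned_spearman(pairs, [0, 10_000, 100_000])` returns one correlation per distance bin, which shows how the agreement between the two differential signals changes with CRE-to-gene distance.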






  • Article type: Journal Article
    Genetic predisposition to cardiac arrhythmias has been a field of intense investigation. Research initially focused on rare hereditary arrhythmias, but over the last two decades, the role of genetic variation (single nucleotide polymorphisms) in heart rate, rhythm, and arrhythmias has been taken into consideration as well. In particular, genome-wide association studies have identified hundreds of genomic loci associated with quantitative electrocardiographic traits, atrial fibrillation, and less common arrhythmias such as Brugada syndrome. A significant number of associated variants have been found to systematically localize in non-coding regulatory elements that control the tissue-specific and temporal transcription of genes encoding transcription factors, ion channels, and other proteins. However, the identification of causal variants and the mechanism underlying their impact on phenotype has proven difficult due to the complex tissue-specific, time-resolved, condition-dependent, and combinatorial function of regulatory elements, as well as their modest conservation across different model species. In this review, we discuss research efforts aimed at identifying and characterizing trait-associated variant regulatory elements and the molecular mechanisms underlying their impact on heart rate or rhythm.






  • Article type: Journal Article
    Enhancer RNAs (eRNAs) are non-coding RNAs produced by transcriptional enhancers that are highly correlated with their activity. Using a capped nascent RNA sequencing (PRO-cap) dataset in human lymphoblastoid cell lines across 67 individuals, we identified inter-individual variation in the expression of over 80 thousand transcribed transcriptional regulatory elements (tTREs), in both enhancers and promoters. Co-expression analysis of eRNAs from tTREs across individuals revealed how enhancers are associated with each other and with promoters. Mid- to long-range co-expression showed a distance-dependent decay that was modified by TF occupancy. In particular, we found a class of "bivalent" TFs, including Cohesin, that both facilitate and isolate the interaction between enhancers and/or promoters, depending on their topology. At short distances, we observed strand-specific correlations between nearby eRNAs in both convergent and divergent orientations. Our results support a cooperative model of convergent eRNAs, consistent with eRNAs facilitating adjacent enhancers rather than interfering with each other. Therefore, our approach to infer functional interactions from co-expression analyses provided novel insights into the principles of enhancer interactions as a function of distance, orientation, and binding landscapes of TFs.






  • Article type: Journal Article
    Transcriptional regulatory elements (TREs) are the primary nodes that control developmental gene regulatory networks. In embryo stages, larvae, and adult differentiated red spherule cells of the sea urchin Strongylocentrotus purpuratus, transcriptionally engaged TREs are detected by Precision Run-On Sequencing (PRO-seq), which maps genome-wide at base pair resolution the location of paused or elongating RNA polymerase II (Pol II). In parallel, TRE accessibility is estimated by the Assay for Transposase-Accessible Chromatin using Sequencing (ATAC-seq). Our analysis identifies surprisingly early and widespread TRE accessibility in 4-cell cleavage embryos that is not necessarily followed by concurrent or subsequent transcription. TRE transcriptional differences identified by PRO-seq provide more contrast among embryonic stages than ATAC-seq accessibility differences, in agreement with the apparent excess of accessible but inactive TREs during embryogenesis. Global TRE accessibility reaches a maximum around the 20-hour late blastula stage, which coincides with the consolidation of major embryo regionalizations and peak histone variant H2A.Z expression. A transcriptional potency model based on labile nucleosome TRE occupancy driven by DNA sequences and the prevalence of histone variants is proposed in order to explain the basal accessibility of transcriptionally inactive TREs during embryogenesis. However, our results would not reconcile well with labile nucleosome models based on simple A/T sequence enrichment. In addition, a large number of distal TREs become transcriptionally disengaged during developmental progression, in support of an early Pol II paused model for developmental gene regulation that eventually resolves in transcriptional activation or silencing. Thus, developmental potency in early embryos may be facilitated by incipient accessibility and transcriptional pause at TREs.





