
  • Article type: Journal Article
    Alternative polyadenylation plays an important role in cancer initiation and progression; however, current transcriptome-wide association studies mostly ignore alternative polyadenylation when identifying putative cancer susceptibility genes. Here, we perform a pan-cancer 3' untranslated region alternative polyadenylation transcriptome-wide association analysis by integrating 55 well-powered (n > 50,000) genome-wide association study datasets across 22 major cancer types with alternative polyadenylation quantification from 23,955 RNA sequencing samples across 7,574 individuals. We find that genetic variants associated with alternative polyadenylation are co-localized with 28.57% of cancer loci and contribute a significant portion of cancer heritability. We further identify 642 significant cancer susceptibility genes predicted to modulate cancer risk via alternative polyadenylation, 62.46% of which have been overlooked by traditional expression- and splicing-based studies. As proof-of-principle validation, we show that alternative alleles facilitate 3' untranslated region lengthening of the CRLS1 gene, leading to increased protein abundance and promoted proliferation of breast cancer cells. Together, our study highlights the significant role of alternative polyadenylation in discovering new cancer susceptibility genes and provides a strong foundational framework for enhancing our understanding of the etiology underlying human cancers.






  • Article type: Journal Article
    Alternative polyadenylation (APA) plays an essential role in brain development; however, current transcriptome-wide association studies (TWAS) largely overlook APA in nominating susceptibility genes. Here, we performed a 3' untranslated region (3'UTR) APA TWAS (3'aTWAS) for 11 brain disorders by combining their genome-wide association studies data with 17,300 RNA-seq samples across 2,937 individuals. We identified 354 3'aTWAS-significant genes, including known APA-linked risk genes, such as SNCA in Parkinson's disease. Among these 354 genes, ~57% are not significant in traditional expression- and splicing-TWAS studies, since APA may regulate the translation, localization and protein-protein interaction of the target genes independent of mRNA level expression or splicing. Furthermore, we discovered ATXN3 as a 3'aTWAS-significant gene for amyotrophic lateral sclerosis, and its modulation substantially impacted pathological hallmarks of amyotrophic lateral sclerosis in vitro. Together, 3'aTWAS is a powerful strategy to nominate important APA-linked brain disorder susceptibility genes, most of which are largely overlooked by conventional expression and splicing analyses.






  • Article type: Evaluation Study
    Alternative cleavage and polyadenylation (APA), an RNA processing event, occurs in over 70% of human protein-coding genes. APA results in mRNA transcripts with distinct 3' ends. Most APA occurs within 3' UTRs, which harbor regulatory elements that can impact mRNA stability, translation, and localization.
    APA can be profiled using a number of established computational tools that infer polyadenylation sites from standard, short-read RNA-seq datasets. Here, we benchmarked a number of such tools (TAPAS, QAPA, DaPars2, GETUTR, and APATrap) against 3'-Seq, a specialized RNA-seq protocol that enriches for reads at the 3' ends of genes, and Iso-Seq, a Pacific Biosciences (PacBio) single-molecule full-length RNA-seq method, in their ability to identify polyadenylation sites and quantify polyadenylation site usage. We demonstrate that 3'-Seq and Iso-Seq are able to identify and quantify the usage of polyadenylation sites more reliably than computational tools that take short-read RNA-seq as input. However, we find that running one such tool, QAPA, with a set of polyadenylation site annotations derived from small quantities of 3'-Seq or Iso-Seq can reliably quantify variation in APA across conditions, such as across genotypes, as demonstrated by the successful mapping of alternative polyadenylation quantitative trait loci (apaQTL).
    We envisage that our analyses will shed light on the advantages of studying APA with more specialized sequencing protocols, such as 3'-Seq or Iso-Seq, and the limitations of studying APA with short-read RNA-seq. We provide a computational pipeline to aid in the identification of polyadenylation sites and quantification of polyadenylation site usage using Iso-Seq data as input.
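    To make "polyadenylation site usage" concrete, the following is a minimal sketch that assumes usage is defined as the fraction of a gene's 3'-end reads assigned to each site. It is not the algorithm used by QAPA or by the authors' pipeline; the function name, the input layout, and the gene and site labels are all hypothetical.

```python
from collections import defaultdict


def polya_site_usage(read_counts):
    """Return the relative usage of each poly(A) site within its gene.

    read_counts maps (gene, site) -> number of reads supporting that site.
    Usage is assumed here to be count / total reads for the gene.
    """
    gene_totals = defaultdict(int)
    for (gene, _site), count in read_counts.items():
        gene_totals[gene] += count

    usage = {}
    for (gene, site), count in read_counts.items():
        total = gene_totals[gene]
        # Genes with no supporting reads get a usage of 0.0 rather than an error.
        usage[(gene, site)] = count / total if total > 0 else 0.0
    return usage


# Hypothetical counts: one gene with two sites, one gene with a single site.
counts = {
    ("GENE_A", "proximal"): 30,
    ("GENE_A", "distal"): 70,
    ("GENE_B", "only"): 12,
}
usage = polya_site_usage(counts)
```

    Under this definition the usages of all sites in a gene sum to 1, so a shift toward the proximal site in one genotype shows up as a change in a single bounded quantity, which is the kind of trait an apaQTL analysis would test.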






  • Article type: Journal Article
    At present, there is no clear understanding of the effect of long-duration spaceflight on the major enzymes that govern the metabolism of omega-6 and omega-3 fatty acids. To address this gap in knowledge, we used data from the NASA Twins Study, which includes a multiscale omics investigation of the changes that occurred during a year-long (340 days) human spaceflight. Embedded within the NASA Twins data are specific analytes associated with fatty acid metabolism.
    To examine the long-chain fatty acid desaturases and elongases in a single human during 1 year in space.
    One male twin was on board the International Space Station (ISS) for 1 year, while his monozygotic twin served as a genetically matched ground control. Longitudinal assessments included the genome, epigenome, transcriptome, proteome, metabolome, microbiome, and immunome during the mission, as well as 6 months before and after. The gene-specific fatty acid desaturase and elongase transcriptome data (FADS1, FADS2, ELOVL2, and ELOVL5) were extracted from untargeted RNA-seq measurements derived from white blood cell fractions.
    Most data from the elongases and desaturases exhibited relatively similar expression profiles (R² > 0.6) over time for the CD8, CD19, and lymphocyte-depleted (LD) cell fractions, indicating overall conservation of function within and between the subjects. Both cell-type and temporal specificity were observed in some cases, and some differences were also apparent between the polyadenylated (polyA) fraction of processed RNAs and the ribodepleted (ribo-) fraction. The flight subject showed a stronger enrichment of the fatty acid metabolic process pathway across almost all cell types (columns, CD4, CD8, CPT, and LD), most especially in the ribodepleted fraction of RNA, but also in the polyA+ fraction of RNA. Gene set enrichment analysis (GSEA) measures across three related fatty acid metabolism pathways showed a differential between the ground and the flight subject.
    There appears to be no persistent alteration of desaturase and elongase gene expression associated with 1 year in space. However, these data provide evidence that cellular lipid metabolism can be responsive and dynamic to spaceflight, even though it appears cell-type and context specific, most notably in terms of the fraction of RNA measured and the collection protocols. These results also provide new evidence of mid-flight spikes in expression of selected genes, which may indicate transient responses to specific insults during spaceflight.
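    The similarity threshold quoted above (R² > 0.6) can be illustrated with a short sketch. It assumes that R² is the squared Pearson correlation between two expression time series, which the abstract does not state explicitly; the function name and the example values are hypothetical and are not data from the study.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical expression values at four time points for two subjects.
flight = [1.0, 2.0, 3.0, 4.0]
ground = [2.0, 4.0, 6.0, 8.0]
similar = r_squared(flight, ground) > 0.6
```

    Because R² ignores the sign and scale of the relationship, two profiles can pass this threshold while differing in absolute expression level, so it measures the shape of the time course rather than its magnitude.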






  • Article type: Journal Article
    Myotonic dystrophy type 1 (DM1) is a rare genetic disorder, characterised by muscular dystrophy, myotonia, and other symptoms. DM1 is caused by the expansion of a CTG repeat in the 3'-untranslated region of DMPK. Longer CTG expansions are associated with greater symptom severity and earlier age at onset. The primary mechanism of pathogenesis is thought to be mediated by a gain of function of the CUG-containing RNA that leads to trans-dysregulation of RNA metabolism of many other genes. Specifically, the alternative splicing (AS) and alternative polyadenylation (APA) of many genes are known to be disrupted. In the context of clinical trials of emerging DM1 treatments, it is important to be able to objectively quantify treatment efficacy at the level of molecular biomarkers. We show how previously described candidate mRNA biomarkers can be used to model an effective reduction in CTG length, using modern high-dimensional statistics (machine learning), and a blood and muscle mRNA microarray dataset. We show how this model could be used to detect treatment effects in the context of a clinical trial.







  • Article type: Journal Article
    X-linked hypophosphatemia (XLH), the most prevalent heritable renal phosphate (Pi) wasting disorder, is caused by deactivating mutations of PHEX. Consequently, circulating phosphatonin FGF23 becomes elevated and hypophosphatemia in affected children leads to rickets with skeletal deformity and reduced linear growth while affected adults suffer from osteomalacia and forms of ectopic mineralization. In 2015, we reported uniquely mild XLH in six children and four of their mothers carrying the non-coding PHEX 3'-UTR mutation c.*231A>G. Herein, we characterize this mild XLH variant by comparing its features in 30 individuals to 30 age- and sex-matched patients with XLH but without the 3'-UTR mutation. The "UTR" and "XLH" groups, both comprising 17 children (2 to 17 years, 3 girls) and 13 adults (23 to 63 years, 10 women), had mean ages of 23 years. Only 43% of the UTR group versus 90% of the XLH group had received medical treatment for their disorder, including 0% versus 85% of the females, respectively (ps < .0001). The UTR group was taller: mean ± SD height Z-score (HZ) -1.0 ± 1.0 versus -2.0 ± 1.4 (p = .0034), with significantly greater height for females (-0.9 ± 0.7 versus -2.3 ± 1.4; p = .0050) but not males (-1.2 ± 1.1 versus -1.9 ± 1.5; p = .1541), respectively. Mean ± SD "arm span Z-score" (AZ) did not differ between the UTR -0.8 ± 1.3 versus XLH -1.3 ± 1.8 groups (p = .2269). Consequently, the UTR group was more proportionate with a mean ∆Z (AZ - HZ) of 0.1 ± 0.6 versus 0.7 ± 1.0 (p = .0158), respectively. Compared to the XLH group, the UTR group had significantly higher fasting serum Pi and renal tubular threshold maximum for phosphorus per glomerular filtration rate (TmP/GFR) (ps ≤ .0060), serum FGF23 concentrations within the reference range (p = .0068), and similar serum alkaline phosphatase levels (p = .6513). UTR lumbar spine bone mineral density Z-score was higher (p = .0343). Thus, the 3'-UTR variant of XLH is distinctly mild, especially in girls and women, posing challenges for its recognition and management. © 2020 American Society for Bone and Mineral Research.






  • Article type: Journal Article
    The CFIm25 subunit of the heterotetrameric cleavage factor Im (CFIm) is a critical factor in the formation of the poly(A) tail at the mRNA 3' end, regulating the recruitment of polyadenylation factors, poly(A) site selection, and cleavage/polyadenylation reactions. We previously reported the homologous protein (EhCFIm25) in Entamoeba histolytica, the protozoan causing human amoebiasis, and showed the relevance of conserved Leu135 and Tyr236 residues for RNA binding. We also identified the GUUG sequence as the recognition site of EhCFIm25. To understand the interaction network that allows EhCFIm25 to maintain its three-dimensional structure and function, here we performed molecular dynamics simulations of wild-type (WT) and mutant proteins, alone or interacting with the GUUG molecule. Our results indicated that in the presence of the GUUG sequence, WT converged more quickly to lower RMSD values in comparison with mutant proteins. However, RMSF values showed that movements of amino acids of WT and EhCFIm25*L135T were almost identical, whether or not they interacted with the GUUG molecule. Interestingly, EhCFIm25*L135T, which is the only mutant with a slight RNA binding activity experimentally, presents the same stabilization of bend structures and alpha helices as WT, notably in the C-terminus. Moreover, WT and EhCFIm25*L135T presented almost the same number of contacts that mainly involve lysine residues interacting with the G4 nucleotide. Overall, our data provide a clear description of the structural and mechanistic features that govern the RNA binding capacity of EhCFIm25.
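    The two trajectory metrics named above can be sketched as follows, under the assumption that all frames have already been superposed onto the reference structure (no alignment is performed here). This is a generic textbook formulation, not the authors' analysis code; the function names and the toy two-atom trajectory are hypothetical.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square deviation of one frame from a reference.

    Both arguments are lists of (x, y, z) tuples with atoms in the same order.
    """
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(reference))


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation around the time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        # Average position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        squared = sum(
            (frame[i][k] - mean[k]) ** 2 for frame in trajectory for k in range(3)
        )
        fluctuations.append(math.sqrt(squared / n_frames))
    return fluctuations


# Toy trajectory: atom 0 is fixed, atom 1 moves along x between two frames.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
deviation = rmsd(trajectory[1], trajectory[0])
per_atom = rmsf(trajectory)
```

    RMSD yields one value per frame and so describes how quickly the whole structure settles, whereas RMSF yields one value per atom (or residue) and so describes which parts of the protein remain mobile. This is why the two metrics can lead to different conclusions when WT and mutant proteins are compared.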






  • Article type: Journal Article
    We propose a new analytical scheme in which field-flow fractionation (FFF)-based separation of target-specific polystyrene (PS) particle probes of different sizes is combined with amplified surface-enhanced Raman scattering (SERS) tagging for the simultaneous and sensitive detection of multiple microRNAs (miRNAs). For multiplexed detection, PS particles of three different diameters (15, 10, 5 μm) were used for size-coding, and a probe single-stranded DNA (ssDNA) complementary to a target miRNA was conjugated to the intended PS particle. After binding of a target miRNA to the PS probe, a polyadenylation reaction was performed to generate a long tail composed of adenine (A), serving as a binding site for thymine (T)-conjugated Au nanoparticles (T-AuNPs) to increase SERS intensity. The three size-coded PS probes bound with T-AuNPs were then separated in an FFF channel. Extinction-based fractograms clearly confirmed the separation of the three size-coded PS probes, thereby enabling the simultaneous measurement of three miRNAs. Raman intensities of FFF fractions collected at the peak maxima of the 15, 10 and 5 μm PS probes varied fairly quantitatively with changes in miRNA concentration, and the reproducibility of measurement was acceptable. The proposed method is potentially useful for simultaneous detection of multiple miRNAs with high sensitivity.






  • Article type: Journal Article
    Alternative polyadenylation has been recognized as a key contributor to gene expression regulation by generating different transcript isoforms with altered 3' ends. Although polyadenylation is well known for marking the end of a 3' UTR, an increasing number of studies have reported previously less-addressed polyadenylation events located in other parts of genes in many eukaryotic organisms. These other locations include 5' UTRs, introns and coding sequences (termed herein as non-3UTR), as well as antisense and intergenic polyadenylation. Focusing on the non-3UTR polyadenylation sites (n3PASs), we detected and characterized more than 11,000 n3PAS clusters in the Arabidopsis genome using poly(A)-tag sequencing data (PAT-Seq). Further analyses suggested that the occurrence of these n3PASs was positively correlated with certain characteristics of their respective host genes, including the presence of spliced, diminutive or diverse beginnings of 5' UTRs, the number of introns and whether introns have extreme lengths. The interaction of the host genes with surrounding genetic elements, such as a convergently overlapped gene or an associated transposable element, may contribute to the generation of an n3PAS as well. Collectively, these results provide a better understanding of n3PASs, and offer some new insights into the underlying mechanisms for non-3UTR polyadenylation and its regulation in plants.
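    Because cleavage positions scatter over a few nucleotides, nearby poly(A) tags are usually grouped into clusters before counting sites. The sketch below shows one simple way to do this, assuming that tags on the same strand within a fixed gap of the previous tag belong to the same cluster. The 24 nt gap and the function name are illustrative assumptions and are not taken from the abstract.

```python
def cluster_sites(positions, max_gap=24):
    """Group poly(A) tag positions into clusters of neighbouring sites.

    positions: genomic coordinates on one chromosome and strand.
    max_gap: assumed maximum distance (nt) between adjacent tags in a cluster.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            # Close enough to the previous tag: extend the current cluster.
            clusters[-1].append(pos)
        else:
            # Too far away (or first tag): start a new cluster.
            clusters.append([pos])
    return clusters


# Hypothetical tag positions within one gene.
clusters = cluster_sites([200, 100, 130, 110, 210])
```

    With this single-linkage rule a cluster can span more than `max_gap` nucleotides in total, since only the distance between adjacent tags is checked. A stricter definition would cap the overall cluster width as well.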






  • Article type: Journal Article
    A major objective of systems biology is to quantitatively integrate multiple parameters from genome-wide measurements. To integrate gene expression with dynamics in poly(A) tail length and adenylation site, we developed a targeted next-generation sequencing approach, Poly(A)-Test RNA-sequencing (PAT-seq). PAT-seq returns (i) digital gene expression, (ii) polyadenylation site(s), and (iii) the polyadenylation state within and between eukaryotic transcriptomes. PAT-seq differs from previous 3'-focused RNA-seq methods in that it depends strictly on 3' adenylation within total RNA samples and that the full native poly(A) tail is included in the sequencing libraries. Here, total RNA samples from budding yeast cells were analyzed to identify the intersection between adenylation state and gene expression in response to loss of the major cytoplasmic deadenylase Ccr4. Furthermore, concordant changes to gene expression and adenylation state were demonstrated in the classic Crabtree-Warburg metabolic shift. Because all polyadenylated RNA is interrogated by the approach, alternative adenylation sites, noncoding RNA and RNA-decay intermediates were also identified. Most importantly, the PAT-seq approach uses standard sequencing procedures and supports significant multiplexing; thus, replication and rigorous statistical analyses can for the first time be brought to the measurement of 3'-UTR dynamics genome-wide.





