sORFs

  • Article type: Journal Article
    BACKGROUND: JUB1, a hydrogen peroxide-induced NAC domain-containing transcription factor, plays a critical role in plant immunity. Little is known about how JUB1 responds to leaf rust disease in wheat. Recent discoveries in genomics have also unveiled a multitude of sORFs that are often assumed to be non-functional, arguing for their inclusion as potential regulatory players in translation. However, whether methylation of sORFs spanning the 3' UTR of regulatory genes such as JUB1 modulates gene expression remains unclear.
    RESULTS: In this study, we identified the methylation states of cytosine residues at CpG, CHH and CHG sites in two sORFs in the 3' UTR of TaJUB1-L, a wheat homolog of JUB1, at different time points of disease progression during leaf rust pathogenesis in two near-isogenic lines of wheat (HD2329), with and without the Lr24 gene. We report a significant demethylation of the CpG dinucleotides in the sORFs of the 3' UTR in the resistant isolines after 24 h post-infection. In addition, the up-regulation of gene expression observed by RT-qPCR was directly proportional to the demethylation of the CpG sites in the sORFs.
    CONCLUSIONS: Our findings indicate that TaJUB1-L might be a positive regulator that provides tolerance during leaf rust pathogenesis, and that cytosine methylation in the 3' UTR might act as a switch controlling its expression. These results underscore the potential of conventional methylation assay techniques for unraveling unexplored questions in epigenetics during plant-pathogen interaction in a cost-effective and confidently conclusive manner.
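The three cytosine contexts named in the abstract (CpG, CHG and CHH, where H is A, C or T) can be told apart by looking at the one or two bases downstream of each cytosine. The following is a minimal sketch of that classification on a single strand; the function name and the example sequence are illustrative assumptions and are not taken from the study.

```python
def cytosine_contexts(seq):
    """Map the index of each cytosine to its context: 'CG', 'CHG' or 'CHH'.

    H stands for A, C or T. Only the given strand is scanned, and the
    sequence is assumed to contain only A, C, G and T. Cytosines too close
    to the 3' end to be classified are skipped.
    """
    seq = seq.upper()
    contexts = {}
    for i, base in enumerate(seq):
        if base != "C":
            continue
        downstream = seq[i + 1 : i + 3]  # up to two bases after the C
        if len(downstream) >= 1 and downstream[0] == "G":
            contexts[i] = "CG"
        elif len(downstream) == 2 and downstream[1] == "G":
            contexts[i] = "CHG"
        elif len(downstream) == 2:
            contexts[i] = "CHH"
    return contexts


# Hypothetical sequence: one cytosine in each context.
print(cytosine_contexts("ACGTCAGCTT"))
```

A per-site methylation level would then typically be the fraction of reads (or clones) in which that cytosine is methylated, grouped by the context returned here.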

  • Article type: Journal Article
    Traditionally, non-coding RNAs (ncRNAs) have been regarded as a class of RNA transcripts that lack coding capacity; however, advances in technology have revealed that some ncRNAs contain small open reading frames (sORFs) capable of encoding micropeptides of approximately 150 amino acids in length. sORF-encoded micropeptides (SEPs) have emerged as intriguing entities in hepatocellular carcinoma (HCC) research, shedding light on this previously unexplored realm. Recent studies have highlighted the regulatory functions of SEPs in the occurrence and progression of HCC. Some SEPs inhibit HCC, whereas others facilitate its development. This discovery has changed the landscape of HCC research and clinical management. Here, we introduce the concept and characteristics of SEPs, summarize their associations with HCC, and elucidate their carcinogenic mechanisms in HCC metabolism, signaling pathways, cell proliferation, and metastasis. In addition, we propose a step-by-step workflow for the investigation of HCC-associated SEPs. Lastly, we discuss the challenges and prospects of applying SEPs in the diagnosis and treatment of HCC. This review aims to facilitate the discovery, optimization, and clinical application of HCC-related SEPs, inspiring the development of early diagnostic, individualized, and precision therapeutic strategies for HCC.

  • Article type: Journal Article
    Long non-coding RNAs (lncRNAs) are non-protein-coding transcripts more than 200 nucleotides in length. Deep sequencing technologies have revealed that lncRNAs can harbor translatable short open reading frames (sORFs). Yet the regulatory mechanisms governing lncRNA translation events remain poorly understood. Here, we exhaustively surveyed the sequence, functional element, and structural features relevant to lncRNA translation in human. Extensive identification and analysis reveal that, compared with untranslatable lncRNAs, translatable lncRNAs contain richer protein-coding-related sequence features, cap-dependent and cap-independent translation initiation mechanisms, and more stable secondary structures. These findings strongly support the view that lncRNAs serve as a repository for the production of new small peptides. Based on the fusion of features affecting translation and the extreme gradient boosting (XGBoost) algorithm, we developed the first computational tool dedicated to predicting translatable lncRNAs, named TransLncPred. Benchmark experiments show that our method outperforms several state-of-the-art RNA coding potential prediction tools on the same training and testing datasets. The 100-time 10-fold cross-validation tests also demonstrate that regulatory element-derived features, especially N7-methylguanosine (m7G) and internal ribosome entry site (IRES), contribute to the improvement in predictive performance.
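The "100-time 10-fold cross-validation" mentioned above means repeating a 10-fold split 100 times, each time with a fresh shuffle, so that performance estimates do not depend on a single partition. The sketch below generates such splits with the Python standard library only; it is an illustration of the evaluation scheme, not the TransLncPred code, and the function name and defaults are assumptions.

```python
import random


def repeated_kfold(n_samples, n_splits=10, n_repeats=100, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Each repeat reshuffles the sample indices and partitions them into
    n_splits disjoint test folds; the remaining indices form the training set.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Striding through the shuffled order gives near-equal fold sizes.
        folds = [order[k::n_splits] for k in range(n_splits)]
        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield repeat, fold, train_idx, test_idx


# Example: 3 repeats of 5-fold CV over 25 samples gives 15 train/test splits.
n_rounds = sum(1 for _ in repeated_kfold(25, n_splits=5, n_repeats=3))
print(n_rounds)
```

A model would be fitted on `train_idx` and scored on `test_idx` in every round, and the scores averaged over all rounds.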

  • Article type: Journal Article
    Recently, small open reading frames (sORFs) in long noncoding RNA (lncRNA) have been demonstrated to encode small peptides that can help in studying the mechanisms of growth and development in organisms. Since machine learning-based computational methods are less costly than biological experiments, they can be used to identify sORFs and provide a basis for biological experiments. However, few computational methods and data resources have been developed for identifying sORFs in plant lncRNA. Moreover, machine learning models produce underperforming classifiers when faced with a class-imbalance problem. In this study, a variant of SMOTE based on weighted cosine distance (WCDSMOTE), which interacts with feature selection, is put forward to synthesize minority class samples, and weighted edited nearest neighbor (WENN) is applied to clean up majority class samples. The resulting hybrid sampling scheme, WCDSMOTE-ENN, is proposed to deal with imbalanced datasets with multi-angle features. A heterogeneous classifier ensemble is introduced to complete the classification task. On this basis, a novel computational method based on class-imbalance learning is presented to identify sORFs with coding potential in plant lncRNA (sORFplnc). Experimental results show that sORFplnc outperforms existing computational methods in identifying sORFs with coding potential. We anticipate that the proposed work can serve as a reference for related research and contribute to agriculture and biomedicine.
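To make the oversampling idea concrete, the following is a minimal sketch of SMOTE-style interpolation in which neighbours are chosen by plain (unweighted) cosine distance. It is not the WCDSMOTE algorithm from the abstract: the feature weighting, the interaction with feature selection, and the WENN cleaning step are all omitted, and every name here is an illustrative assumption.

```python
import math
import random


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def smote_cosine(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation.

    For each new sample, pick a random minority point, pick one of its k
    nearest minority neighbours (by cosine distance), and interpolate at a
    random position on the segment between the two.
    """
    if len(minority) < 2 or k < 1:
        raise ValueError("need at least two minority samples and k >= 1")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: cosine_distance(base, s),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Hypothetical 2-feature minority class.
minority = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [1.0, 0.2]]
print(smote_cosine(minority, n_new=2, k=2))
```

Because each synthetic point is a convex combination of two real minority samples, it always lies within the range spanned by the minority class in every feature.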

  • Article type: Journal Article
    The roles of short/small open reading frames (sORFs) have been increasingly recognized in recent years, as the number of sORFs identified in various organisms has grown rapidly with the development and application of the Ribo-Seq technique, which sequences the ribosome-protected footprints (RPFs) of translating mRNAs. However, special attention should be paid to RPFs used to identify sORFs in plants because of their small size (~30 nt) and the high complexity and repetitiveness of plant genomes, particularly in polyploid species. In this work, we compare different approaches to the identification of plant sORFs, discuss the advantages and disadvantages of each method, and provide a guide for choosing among methods in plant sORF studies.

  • Article type: Journal Article
    Long non-coding RNAs (lncRNAs) are important regulators of biological processes. It has recently been shown that some lncRNAs include small open reading frames (sORFs) that can encode small peptides of no more than 100 amino acids. However, existing methods are commonly applied to human and animal datasets and still suffer from low feature representation capability. Thus, accurate and credible prediction of sORFs with coding ability in plant lncRNAs is imperative. This paper proposes a new method termed sORFPred, in which we design a model named MCSEN that combines multi-scale convolution and Squeeze-and-Excitation Networks to fully mine distinct information embedded in sORFs, integrate and optimize multiple sequence-based and physicochemical feature descriptors, and build a two-layer prediction classifier based on a Bayesian optimization algorithm and Extra Trees. sORFPred has been evaluated on sORF datasets from three species and on an experimentally validated sORF dataset. Results indicate that sORFPred outperforms existing methods and achieves 97.28% accuracy, 97.06% precision, 97.52% recall, and a 97.29% F1-score on Arabidopsis thaliana, a significant improvement in prediction performance over various conventional shallow machine learning and deep learning models.
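The four figures reported above follow the standard definitions for binary classification. As a reminder of how they relate, here is a short sketch that derives them from confusion-matrix counts; the function names and the example counts are hypothetical and are not data from the paper.

```python
def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# Hypothetical counts: 8 true positives, 2 false positives,
# 7 true negatives, 3 false negatives.
print(classification_metrics(tp=8, fp=2, tn=7, fn=3))
```

Applying the harmonic-mean formula to the reported precision (97.06%) and recall (97.52%) gives approximately 97.29%, which agrees with the F1-score stated in the abstract.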

  • Article type: Journal Article
    Small genes (<150 nucleotides) have been systematically overlooked in phage genomes. We employ a large-scale comparative genomics approach to predict >40,000 small-gene families in ~2.3 million phage genome contigs. We find that small genes are approximately 3-fold more prevalent in phage genomes than in host prokaryotic genomes. Our approach enriches for small genes that are translated in microbiomes, suggesting that the small genes identified are coding. More than 9,000 families encode potentially secreted or transmembrane proteins, more than 5,000 families encode predicted anti-CRISPR proteins, and more than 500 families encode predicted antimicrobial proteins. By combining homology and genomic-neighborhood analyses, we reveal substantial novelty and diversity within phage biology, including small phage genes found in multiple host phyla, small genes encoding proteins that play essential roles in host infection, and small genes that share genomic neighborhoods and whose encoded proteins may share related functions.

  • Article type: Journal Article
    With the rapid development of computational biology and deep sequencing technology, a growing number of studies have shown that many previously unannotated, non-classical open reading frames (ORFs) hidden in non-coding RNA can encode functional micropeptides. This article reviews the current research status and technical strategies concerning the gene sources, biological properties, prediction methods, and functional verification of micropeptides, providing a theoretical basis and reference for the subsequent discovery of micropeptides, research on their regulatory mechanisms, and the development of novel targets and biomarkers.

  • Article type: Journal Article
    An exciting emerging topic in the noncoding RNA (ncRNA) field is the discovery of short peptides called micropeptides (≤100 amino acids), whose novel therapeutic opportunities remain under-explored. Micropeptides have been suggested to play essential regulatory roles in diverse physiological and pathological processes across species. Genomics studies have revealed that these micropeptides are encoded by small open reading frames (sORFs) concealed in misannotated ncRNAs, generally lncRNAs (long noncoding RNAs) and circRNAs (circular RNAs). These ncRNA-encoded micropeptides have been shown to contribute to tumorigenesis, but little is known about their pathological mechanisms because of challenges in techniques for identifying translated sORFs. Here, we review the best-validated micropeptides involved in the progression of human tumors and discuss their therapeutic and/or prognostic potential. We also offer our own suggestions on the concepts of potential-coding RNA and micropeptides.
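As a rough illustration of what "sORFs concealed in misannotated ncRNAs" means computationally, the sketch below scans the three forward reading frames of a transcript for ATG-initiated ORFs that encode at most 100 amino acids. It is a deliberately naive assumption-laden example: it ignores non-ATG start codons, the reverse strand, and any evidence of translation, all of which real sORF pipelines take into account, and the function name and sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, max_aa=100, min_aa=1):
    """Return (start, end, frame) tuples for short ATG-initiated ORFs.

    Coordinates are 0-based and end-exclusive, and the end includes the stop
    codon. Peptide length counts the initiator Met but not the stop codon.
    Only the forward strand is scanned, and the first ATG after the previous
    stop in a frame is taken as the start.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i : i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if min_aa <= n_aa <= max_aa:
                    hits.append((start, i + 3, frame))
                start = None
    return hits


# Hypothetical transcript fragment with one 2-residue ORF in frame 2.
print(find_sorfs("GGATGCCCTGA"))
```

Candidates found this way are only sequence-level possibilities; the abstracts in this list rely on ribosome profiling, mass spectrometry, or trained classifiers to decide which of them are actually translated.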

  • Article type: Journal Article
    Advanced analytic techniques, such as ribosome profiling and mass spectrometry, together with improved bioinformatics technology, have advanced the field of genome annotation and have identified thousands of likely coding short open reading frames (sORFs) in the human genome. The discovery of sORFs and their products shows that the complexity of the human genome is far greater than previously assumed. Here, we provide a review of human micropeptides encoded by various transcripts, such as mitochondrial rRNAs, long noncoding RNAs, circular RNAs, and regions upstream of mRNA coding sequences.