Keywords: ceRNA; gene regulation; lncRNA; miRNA; non-coding RNA; RNA-Seq

Source: DOI:10.1016/j.compbiolchem.2024.108140

Abstract:
Long non-coding RNAs (lncRNAs) play crucial roles in the regulation of gene expression and maintenance of genomic integrity through various interactions with DNA, RNA, and proteins. The availability of large-scale sequence data from various high-throughput platforms has opened possibilities to identify, predict, and functionally annotate lncRNAs. As a result, there is a growing demand for an integrative computational framework capable of identifying known lncRNAs, predicting novel lncRNAs, and inferring the downstream regulatory interactions of lncRNAs at genome scale. We present ETENLNC (End-To-End-Novel-Long-NonCoding), a user-friendly, integrative, open-source, scalable, and modular computational framework for identifying and analyzing lncRNAs from raw RNA-Seq data. ETENLNC employs six stringent filtration steps to identify novel lncRNAs, performs differential expression analysis of mRNA and lncRNA transcripts, and predicts regulatory interactions between lncRNAs, mRNAs, miRNAs, and proteins. We benchmarked ETENLNC against six existing tools and optimized it for desktop workstations and high-performance computing environments using data from three different species. ETENLNC is freely available on GitHub: https://github.com/EvolOMICS-TU/ETENLNC.