Keywords: ATAC-seq; ChIP-seq; NGS; RNA-seq; clustering; differential gene expression; gene-set enrichment; miRNA-seq; next-generation sequencing; peak calling

Source: DOI: 10.3390/biology13070492

Abstract:
With the advent of next-generation sequencing (NGS), experimental techniques that capture the biological significance of DNA loci or RNA molecules have emerged as fundamental tools for studying the epigenome and transcriptional regulation on a genome-wide scale. The volume of the generated data and the complexity of their analysis highlight the need for robust and easy-to-use computational methods that can streamline the process and provide valuable biological insights. Our solution, aPEAch, is an automated pipeline that facilitates the end-to-end analysis of both DNA- and RNA-sequencing assays, including small RNA sequencing, from assessing the quality of the input sample files to answering meaningful biological questions by exploiting the rich information embedded in biological data. The pipeline is implemented in Python and follows a modular design that lets users choose the path and extent of the analysis and how the results are represented. It can process samples with single or multiple replicates in batches, which makes the analysis easy to run and reproducible across all samples. aPEAch provides a variety of sample metrics, such as quality control reports and fragment size distribution plots, retains all intermediate output files so that the pipeline can be re-executed with different parameters or algorithms, and produces publication-ready visualizations of the results. Furthermore, aPEAch incorporates unsupervised learning analyses by automating clustering optimization and visualization, helping to provide insight into the underlying biological mechanisms.