Keywords: accessible regions; cellular heterogeneity; genomic distance; network diffusion; scATAC-seq

MeSH: Chromatin Immunoprecipitation Sequencing / methods; Chromatin / genetics; Genome; Epigenomics; Data Analysis

Source: DOI:10.1093/bib/bbae093

Abstract:
Single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) data have provided new insights into epigenetic heterogeneity and transcriptional regulation. With dataset resources increasingly abundant, there is an urgent need for high-quality analysis methods designed specifically for scATAC-seq that extract more useful information. However, analyzing scATAC-seq data is challenging because the data are nearly binary, highly sparse and ultra-high-dimensional. Here, we propose a novel network diffusion-based computational method for comprehensive analysis of scATAC-seq data, named Single-Cell ATAC-seq Analysis via Network Refinement with Peaks Location Information (SCARP). SCARP formulates the Network Refinement diffusion method within a graph-theoretic framework to aggregate information from different network orders, effectively compensating for missing signals in scATAC-seq data. By incorporating the genomic distance between adjacent peaks, SCARP also helps depict the co-accessibility of peaks. These two innovations enable SCARP to obtain lower-dimensional representations of both cells and peaks more effectively. Extensive experiments demonstrate that SCARP facilitates superior analyses of scATAC-seq data. Specifically, SCARP exhibits outstanding cell clustering performance, enabling better elucidation of cell heterogeneity and the discovery of new biologically significant cell subpopulations. SCARP is also instrumental in portraying co-accessibility relationships among accessible regions and in providing new insight into transcriptional regulation. Consequently, SCARP identifies genes involved in key disease-related Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and predicts reliable cis-regulatory interactions. In summary, our study suggests that SCARP is a promising tool for comprehensive analysis of scATAC-seq data.
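The two ideas named in the abstract, diffusion over several network orders and distance-aware links between adjacent peaks, can be illustrated with a small NumPy sketch. This is not the SCARP implementation. The function names, the exponential distance weight, the `scale` and `m` parameters, and the toy data are all assumptions made for illustration, and the diffusion used here is a generic geometric-series (random-walk-with-restart style) operator standing in for the paper's Network Refinement operator.

```python
import numpy as np

def build_graph(X, chrom, mid, scale=10_000.0):
    """Cell-plus-peak adjacency matrix (hypothetical construction).

    X     : (n_cells, n_peaks) binary accessibility matrix
    chrom : chromosome label of each peak
    mid   : genomic midpoint of each peak, in bp
    scale : assumed decay length for the peak-to-peak weight
    """
    X = np.asarray(X, dtype=float)
    mid = np.asarray(mid, dtype=float)
    _, chrom_code = np.unique(np.asarray(chrom), return_inverse=True)
    n_cells, n_peaks = X.shape
    A = np.zeros((n_cells + n_peaks, n_cells + n_peaks))
    # Bipartite block: a cell is linked to every peak it has open.
    A[:n_cells, n_cells:] = X
    A[n_cells:, :n_cells] = X.T
    # Link each peak to the next peak on the same chromosome, with a
    # weight that decays with genomic distance (assumed form).
    order = np.lexsort((mid, chrom_code))  # by chromosome, then position
    for a, b in zip(order[:-1], order[1:]):
        if chrom_code[a] == chrom_code[b]:
            w = np.exp(-abs(mid[b] - mid[a]) / scale)
            A[n_cells + a, n_cells + b] = w
            A[n_cells + b, n_cells + a] = w
    return A

def diffuse(A, m=0.5):
    """Closed form of (1 - m) * sum_k m^k P^k for k = 0, 1, 2, ...

    P is the row-normalised adjacency; m in (0, 1) sets how much weight
    higher-order neighbourhoods receive.
    """
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0  # leave isolated nodes as all-zero rows of P
    P = A / deg
    return (1.0 - m) * np.linalg.inv(np.eye(A.shape[0]) - m * P)

def embed_cells(S, n_cells, k=2):
    """Low-dimensional cell coordinates from the diffused cell-by-peak block."""
    block = S[:n_cells, n_cells:]
    U, s, _ = np.linalg.svd(block - block.mean(axis=0), full_matrices=False)
    return U[:, :k] * s[:k]

# Toy data: two cell groups; cell 1 has a "dropout" at peak 1.
X = np.array([[1, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1]])
chrom = ["chr1", "chr1", "chr2", "chr2"]
mid = [1000, 3000, 5000, 9000]

A = build_graph(X, chrom, mid)
S = diffuse(A, m=0.5)
Z = embed_cells(S, n_cells=X.shape[0], k=2)
```

In this toy example, the diffused entry for cell 1 and peak 1 becomes positive even though the raw count is zero, because the signal reaches peak 1 through the shared peak 0 and through the peak-to-peak edge. Peaks on the other chromosome stay disconnected from cell 1, so their diffused entries remain zero.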