Keywords: diffusion model; imputation; scRNA-seq data; spatial transcriptomics data

MeSH: Gene Expression Profiling; Benchmarking; Cluster Analysis; Diffusion; Markov Chains; Sequence Analysis, RNA; Transcriptome

Source: DOI:10.1093/bib/bbae171

Abstract:
Spatial transcriptomics (ST) has become a powerful tool for exploring the spatial organization of gene expression in tissues. Imaging-based methods, though offering superior spatial resolutions at the single-cell level, are limited in either the number of imaged genes or the sensitivity of gene detection. Existing approaches for enhancing ST rely on the similarity between ST cells and reference single-cell RNA sequencing (scRNA-seq) cells. In contrast, we introduce stDiff, which leverages relationships between gene expression abundance in scRNA-seq data to enhance ST. stDiff employs a conditional diffusion model, capturing gene expression abundance relationships in scRNA-seq data through two Markov processes: one introducing noise to transcriptomics data and the other denoising to recover them. The missing portion of ST is predicted by incorporating the original ST data into the denoising process. In our comprehensive performance evaluation across 16 datasets, utilizing multiple clustering and similarity metrics, stDiff stands out for its exceptional ability to preserve topological structures among cells, positioning itself as a robust solution for cell population identification. Moreover, stDiff's enhancement outcomes closely mirror the actual ST data within the batch space. Across diverse spatial expression patterns, our model accurately reconstructs them, delineating distinct spatial boundaries. This highlights stDiff's capability to unify the observed and predicted segments of ST data for subsequent analysis. We anticipate that stDiff, with its innovative approach, will contribute to advancing ST imputation methodologies.
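
The two Markov processes and the conditioning step described in the abstract can be illustrated with a minimal sketch. It is not the stDiff implementation: it assumes a standard Gaussian (DDPM-style) formulation with a linear variance schedule, and it assumes that "incorporating the original ST data into the denoising process" can be approximated by overwriting the observed genes with suitably noised copies after every reverse step. The names `make_schedule`, `forward_noise`, `reverse_step`, `impute`, and `toy_predictor` are hypothetical, and `toy_predictor` is an analytic stand-in for a trained network that is only valid for standard-normal toy data.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (an assumption, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)  # cumulative signal retention
    return betas, alpha_bars

def forward_noise(x0, t, alpha_bars, rng):
    """Noising process: sample x_t from N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return xt, noise

def reverse_step(xt, t, betas, alpha_bars, predict_noise, rng):
    """Denoising process: one ancestral sampling step from x_t to x_{t-1}."""
    eps = predict_noise(xt, t)
    mean = (xt - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(1.0 - betas[t])
    if t == 0:
        return mean  # final step returns the mean without extra noise
    return mean + np.sqrt(betas[t]) * rng.standard_normal(xt.shape)

def impute(x_obs, observed_mask, betas, alpha_bars, predict_noise, rng):
    """Denoise from pure noise, re-injecting the observed genes at every step."""
    x = rng.standard_normal(x_obs.shape)
    for t in reversed(range(len(betas))):
        x = reverse_step(x, t, betas, alpha_bars, predict_noise, rng)
        if t > 0:
            # Observed genes are noised to the level of the current sample.
            known, _ = forward_noise(x_obs, t - 1, alpha_bars, rng)
        else:
            known = x_obs  # last step: keep the measured values exactly
        x = np.where(observed_mask, known, x)
    return x

# Toy usage: 5 cells x 8 genes, the first 5 genes measured, the last 3 missing.
rng = np.random.default_rng(0)
betas, alpha_bars = make_schedule()
x_obs = rng.standard_normal((5, 8))
x_obs[:, 5:] = 0.0  # placeholder values for unmeasured genes (ignored via the mask)
observed_mask = np.zeros((5, 8), dtype=bool)
observed_mask[:, :5] = True

# Optimal noise predictor when the data are i.i.d. standard normal;
# a real model would be a network trained on scRNA-seq expression profiles.
toy_predictor = lambda xt, t: np.sqrt(1.0 - alpha_bars[t]) * xt

x_full = impute(x_obs, observed_mask, betas, alpha_bars, toy_predictor, rng)
```

Because the toy predictor knows nothing about relationships between genes, the imputed columns here are just plausible noise. In the setting the abstract describes, the predictor would be learned from scRNA-seq data so that the measured genes constrain the missing ones.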