Keywords: data integration; data visualization; multi-modal; multi-view; single-cell omics data; soft alignment

Source: DOI:10.1093/bioinformatics/btae471

Abstract:
CONCLUSIONS: One of the first steps in single-cell omics data analysis is visualization, which allows researchers to see how well separated cell types are from each other. When visualizing multiple datasets at once, data integration/batch correction methods are used to merge the datasets. While needed for downstream analyses, these methods modify the feature space (e.g. gene expression) or the PCA space in order to mix cell types between batches as well as possible. This obscures sample-specific features and breaks down local embedding structures that can be seen when a sample is embedded alone. Therefore, in order to improve visual comparisons between large numbers of samples (e.g. multiple patients, omic modalities, different time points), we introduce Compound-SNE, which performs what we term a soft alignment of samples in embedding space. We show that Compound-SNE is able to align cell types in embedding space across samples, while preserving local embedding structures from when samples are embedded independently.
METHODS: Python code for Compound-SNE is available for download at https://github.com/HaghverdiLab/Compound-SNE.
BACKGROUND: Available online. Provides algorithmic details and additional tests.
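
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of aligning cell types across independently embedded samples without disturbing each sample's local structure. It is not Compound-SNE itself: as a simplified stand-in it applies a rigid (orthogonal Procrustes) transform that matches cell-type centroids to those of a reference sample. The function names `celltype_centroids` and `rigid_align`, and the synthetic data, are hypothetical and chosen for illustration only.

```python
import numpy as np


def celltype_centroids(embedding, labels, cell_types):
    """Mean 2-D position of each cell type, in the order given by cell_types."""
    return np.vstack([embedding[labels == ct].mean(axis=0) for ct in cell_types])


def rigid_align(embedding, labels, ref_centroids, cell_types):
    """Rotate/reflect and translate an embedding so that its cell-type
    centroids match ref_centroids in the least-squares sense.

    Assumption: this is a simplified stand-in for a soft alignment,
    not the method implemented in Compound-SNE.
    """
    cents = celltype_centroids(embedding, labels, cell_types)
    mu_src = cents.mean(axis=0)
    mu_ref = ref_centroids.mean(axis=0)
    # Orthogonal Procrustes: R minimizing ||(cents - mu_src) R - (ref - mu_ref)||_F
    u, _, vt = np.linalg.svd((cents - mu_src).T @ (ref_centroids - mu_ref))
    rotation = u @ vt
    return (embedding - mu_src) @ rotation + mu_ref


# Synthetic demonstration: a reference sample and a rotated, shifted copy of it.
rng = np.random.default_rng(0)
cell_types = ["A", "B", "C"]
labels = np.repeat(cell_types, 50)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
ref = np.repeat(centers, 50, axis=0) + rng.normal(scale=0.5, size=(150, 2))

theta = 1.0  # radians
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
sample = ref @ rot + np.array([10.0, -3.0])

aligned = rigid_align(sample, labels,
                      celltype_centroids(ref, labels, cell_types), cell_types)
```

Because the transform is rigid, every pairwise distance within the sample is unchanged, so the local structure of the independent embedding is kept exactly; only the global orientation and position move toward the reference. A soft alignment as described in the abstract would presumably allow more flexibility than this.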