Single-cell

  • Article type: Journal Article
    BACKGROUND: In recent years, the introduction of single-cell RNA sequencing (scRNA-seq) has enabled the analysis of a cell's transcriptome at an unprecedented granularity and processing speed. The experimental outcome of applying this technology is an M × N matrix containing aggregated mRNA expression counts of M genes and N cell samples. From this matrix, scientists can study how cell protein synthesis changes in response to various factors, for example, disease versus non-disease states in response to a treatment protocol. This technology's critical challenge is detecting and accurately recording lowly expressed genes. As a result, low expression levels tend to be missed and recorded as zero, an event known as dropout. This makes the lowly expressed genes indistinguishable from true zero expression and different from the low expression present in cells of the same type. This issue makes any subsequent downstream analysis difficult.
    RESULTS: To address this problem, we propose an approach to measure cell similarity using consensus clustering and demonstrate an effective and efficient algorithm that takes advantage of this new similarity measure to impute the most probable dropout events in the scRNA-seq datasets. We demonstrate that our approach exceeds the performance of existing imputation approaches while introducing the least amount of new noise as measured by clustering performance characteristics on datasets with known cell identities.
    CONCLUSIONS: ccImpute is an effective algorithm to correct for dropout events and thus improve downstream analysis of scRNA-seq data. ccImpute is implemented in R and is available at https://github.com/khazum/ccImpute .
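
The idea of using consensus clustering as a cell similarity measure for imputation can be illustrated with a small sketch. This is not the ccImpute implementation (which is written in R and is not described in detail in this abstract); it is a simplified illustration under stated assumptions: the cluster labelings are taken as already computed by some repeated clustering procedure, every zero is treated as a candidate dropout, and the function names `consensus_matrix` and `impute_zeros` are hypothetical.

```python
def consensus_matrix(labelings):
    """Fraction of labelings in which each pair of cells shares a cluster."""
    n = len(labelings[0])
    together = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1
    return [[c / len(labelings) for c in row] for row in together]


def impute_zeros(counts, cons):
    """Replace each zero with a consensus-weighted mean over the other cells.

    counts is a genes x cells matrix (M rows, N columns).
    Assumption: every zero is a candidate dropout; a real method would
    first decide which zeros are likely dropouts and which are true zeros.
    """
    imputed = [row[:] for row in counts]
    for g, row in enumerate(counts):
        for i, value in enumerate(row):
            if value != 0:
                continue  # observed expression is left untouched
            weights = [cons[i][j] if j != i else 0.0 for j in range(len(row))]
            total = sum(weights)
            if total > 0:
                imputed[g][i] = sum(w * row[j] for j, w in enumerate(weights)) / total
    return imputed


# Toy data: 2 genes x 4 cells, and three hypothetical clustering runs.
counts = [[5, 0, 0, 0],
          [0, 0, 4, 6]]
labelings = [[0, 0, 1, 1],
             [0, 0, 1, 1],
             [0, 0, 0, 1]]

cons = consensus_matrix(labelings)
imputed = impute_zeros(counts, cons)
```

In this toy example, cells 0 and 1 are clustered together in every run, so the zero for gene 0 in cell 1 is filled mostly from cell 0, while cell 3 never co-clusters with cell 0 and keeps its zero for that gene.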

  • Article type: Journal Article
    Single-cell DNA methylation sequencing technology has brought new perspectives to the investigation of epigenetic heterogeneity, creating a need for computational methods that cluster cells based on single-cell methylation profiles. Although several methods have been developed, most of them cluster cells based on a single (dis)similarity measure, failing to capture complete cell heterogeneity and resulting in locally optimal solutions. Here, we present scMelody, which utilizes an enhanced consensus-based clustering model to reconstruct cell-to-cell methylation similarity patterns and identifies cell subpopulations by leveraging information from multiple basic similarity measures. In addition, benefiting from the reconstructed cell-to-cell similarity measure, scMelody can readily use clustering validation criteria to determine the optimal number of clusters. Assessments on distinct real datasets showed that scMelody accurately recapitulated methylation subpopulations and outperformed existing methods in terms of both cluster partitions and the number of clusters. Moreover, when the clustering stability of scMelody was benchmarked on a variety of synthetic datasets, it achieved significant clustering performance gains over existing methods and robustly maintained its clustering accuracy across a wide range of cell numbers, cluster numbers, and CpG dropout proportions. Finally, the real case studies demonstrated the capability of scMelody to assess known cell types and uncover novel cell clusters.
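
Combining several basic similarity measures into one cell-to-cell similarity matrix can be sketched as follows. This is only an illustration, not scMelody's model: the abstract does not name the measures it uses, so the two measures here (matching fraction and a Jaccard-style overlap), the plain averaging, and all function names are assumptions. Methylation profiles are assumed to be binary per CpG site, with `None` marking sites that were not covered in a cell.

```python
def shared_sites(a, b):
    """Pairs of methylation states at CpG sites covered in both cells."""
    return [(x, y) for x, y in zip(a, b) if x is not None and y is not None]


def matching_similarity(a, b):
    """Fraction of jointly covered sites with the same state (1 - Hamming)."""
    pairs = shared_sites(a, b)
    if not pairs:
        return 0.0
    return sum(1 for x, y in pairs if x == y) / len(pairs)


def jaccard_similarity(a, b):
    """Sites methylated in both cells over sites methylated in either cell."""
    pairs = shared_sites(a, b)
    either = sum(1 for x, y in pairs if x == 1 or y == 1)
    if either == 0:
        return 0.0
    both = sum(1 for x, y in pairs if x == 1 and y == 1)
    return both / either


def combined_similarity(cells, measures):
    """Average several pairwise similarity measures into one matrix."""
    n = len(cells)
    return [[sum(m(cells[i], cells[j]) for m in measures) / len(measures)
             for j in range(n)]
            for i in range(n)]


# Toy profiles: 3 cells x 5 CpG sites, None = site not covered.
cells = [
    [1, 1, 0, None, 1],
    [1, 1, 0, 0, None],
    [1, None, 0, 1, 0],
]
sim = combined_similarity(cells, [matching_similarity, jaccard_similarity])
```

The resulting matrix could then be passed to any clustering routine that accepts precomputed similarities, and a validation criterion evaluated on it for several candidate cluster numbers.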