Genetic Heterogeneity

  • Article type: Journal Article
    Single-cell epigenomic data have been growing continuously at an unprecedented pace, but their characteristics, such as high dimensionality and sparsity, pose substantial challenges to downstream analysis. Although deep learning models, especially variational autoencoders, have been widely used to capture low-dimensional feature embeddings, the prevalent Gaussian assumption somewhat disagrees with real data, and these models tend to struggle to incorporate reference information from abundant cell atlases. Here we propose CASTLE, a deep generative model based on the vector-quantized variational autoencoder framework to extract discrete latent embeddings that interpretably characterize single-cell chromatin accessibility sequencing data. We validate the performance and robustness of CASTLE for accurate cell-type identification and reasonable visualization compared with state-of-the-art methods. We demonstrate the advantages of CASTLE for effective incorporation of existing massive reference datasets in a weakly supervised or supervised manner. We further demonstrate CASTLE's capacity for intuitively distilling cell-type-specific feature spectra that unveil cell heterogeneity and biological implications quantitatively.
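The discrete latent embeddings described above come from the quantization step of a vector-quantized variational autoencoder, in which each continuous encoder output is replaced by its nearest codebook vector. The following is a minimal sketch of that lookup only, in plain Python. The codebook, the encoder outputs, and the function names are hypothetical illustrations and are not taken from CASTLE, whose architecture is not detailed here.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(z: Sequence[float],
             codebook: List[Sequence[float]]) -> Tuple[int, Sequence[float]]:
    """Return the index and the vector of the codebook entry nearest to z."""
    index = min(range(len(codebook)),
                key=lambda k: squared_distance(z, codebook[k]))
    return index, codebook[index]


# Hypothetical 2-D codebook with three entries (illustrative values only).
CODEBOOK = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]

# Hypothetical continuous encoder outputs for three cells.
encoder_outputs = [(0.1, -0.2), (0.9, 1.2), (-0.8, 1.7)]

# Each cell is now summarized by a discrete code index.
codes = [quantize(z, CODEBOOK)[0] for z in encoder_outputs]
```

In a trained model the codebook is learned jointly with the encoder and decoder; the sketch fixes it by hand so that only the nearest-neighbour assignment is shown.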

  • Article type: Journal Article
    Single-cell RNA sequencing (scRNA-seq), a powerful tool for studying the tumor microenvironment (TME), does not preserve spatial information on tissue morphology and cellular interactions. To understand the crosstalk between diverse cellular components in proximity in the TME, we performed scRNA-seq coupled with a spatial transcriptomic (ST) assay to profile 41,700 cells from three colorectal cancer (CRC) tumor-normal-blood pairs. Standalone scRNA-seq analyses revealed eight major cell populations: B cells, T cells, monocytes, NK cells, epithelial cells, fibroblasts, mast cells, and endothelial cells. After identifying malignant cells among the epithelial cells, we observed seven subtypes of malignant cells that reflect heterogeneous tumor states: tumor_CAV1, tumor_ATF3_JUN | FOS, tumor_ZEB2, tumor_VIM, tumor_WSB1, tumor_LXN, and tumor_PGM1. By transferring the cellular annotations obtained by scRNA-seq to ST spots, we annotated four regions in a cryosection from CRC patients: tumor, stroma, immune infiltration, and colon epithelium. Furthermore, we observed intensive intercellular interactions between the stroma and tumor regions, which were extremely proximal in the cryosection. In particular, one ligand-receptor pair (C5AR1 and RPS19) was inferred to play a key role in the crosstalk between the stroma and tumor regions. For the tumor region, a typical feature of high TMSB4X expression was identified, which could be a potential marker of CRC. The stroma region was characterized by high VIM expression, suggesting that it fostered a stromal niche in the TME. Collectively, the single-cell and spatial analyses in our study reveal the tumor heterogeneity and molecular interactions in the CRC TME, which provides insights into the mechanisms underlying CRC progression and may contribute to the development of anticancer therapies targeting non-tumor components, such as the extracellular matrix (ECM) in CRC. The typical genes we identified may facilitate new molecular subtyping of CRC.

  • Article type: Journal Article
    Glioblastoma multiforme (GBM) encompasses brain malignancies marked by phenotypic and transcriptional heterogeneity thought to render these tumors aggressive, resistant to therapy, and inevitably recurrent. However, little is known about how the spatial organization of GBM genomes underlies this heterogeneity and its effects. Here, we compile a cohort of 28 patient-derived glioblastoma stem cell-like lines (GSCs) known to reflect the properties of their tumor-of-origin; six of these were primary-relapse tumor pairs from the same patient. We generate and analyze 5 kbp-resolution chromosome conformation capture (Hi-C) data from all GSCs to systematically map thousands of standalone and complex structural variants (SVs) and the multitude of neoloops arising as a result. By combining Hi-C, histone modification, and gene expression data with chromatin folding simulations, we explain how the pervasive, uneven, and idiosyncratic occurrence of neoloops sustains tumor-specific transcriptional programs via the formation of new enhancer-promoter contacts. We also show how even moderately recurrent neoloops can relate to patient-specific vulnerabilities. Together, our data provide a resource for dissecting GBM biology and heterogeneity, as well as for informing therapeutic approaches.

  • Article type: Journal Article
    Common genetic variants and susceptibility loci associated with Alzheimer's disease (AD) have been discovered through large-scale genome-wide association studies (GWAS), GWAS by proxy (GWAX), and meta-analysis of GWAS and GWAX (GWAS+GWAX). However, due to the very low repeatability of AD susceptibility loci and the low heritability of AD, these AD genetic findings have been questioned. We summarize AD genetic findings from the past 10 years and provide a new interpretation of these findings in the context of statistical heterogeneity. We discovered that only 17% of AD risk loci demonstrated reproducibility with a genome-wide significance of P < 5.00E-08 across all AD GWAS and GWAS+GWAX datasets. We highlighted that the AD GWAS+GWAX with the largest sample size failed to identify the most significant signals, the maximum number of genome-wide significant genetic variants, or the maximum heritability. Additionally, we identified widespread statistical heterogeneity in AD GWAS+GWAX datasets, but not in AD GWAS datasets. We consider that statistical heterogeneity may have attenuated the statistical power in AD GWAS+GWAX and may contribute to explaining the low repeatability (17%) of genome-wide significant AD susceptibility loci and the decrease in AD heritability (from 40% to 2%) as the sample size increased. Importantly, evidence supports the idea that a decrease in statistical heterogeneity facilitates the identification of genome-wide significant genetic loci and contributes to an increase in AD heritability. Collectively, current AD GWAX and GWAS+GWAX findings should be meticulously assessed and warrant additional investigation, and AD GWAS+GWAX should employ multiple meta-analysis methods, such as random-effects inverse variance-weighted meta-analysis, which is designed specifically for statistical heterogeneity.
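To make the recommended method concrete, the sketch below pools per-study effect estimates with a random-effects inverse variance-weighted model and reports Cochran's Q and I² as measures of statistical heterogeneity. It assumes the DerSimonian-Laird estimator for the between-study variance, which is one common choice; the abstract does not say which estimator the authors used, and the effect sizes and standard errors here are invented for illustration.

```python
import math
from typing import List, Tuple


def random_effects_meta(betas: List[float],
                        ses: List[float]) -> Tuple[float, float, float, float]:
    """Random-effects inverse variance-weighted meta-analysis (DerSimonian-Laird).

    Returns (pooled_beta, pooled_se, cochran_q, i_squared_percent).
    """
    w = [1.0 / se ** 2 for se in ses]                      # fixed-effect weights
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sum(w)
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))  # Cochran's Q
    df = len(betas) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0        # between-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]        # random-effects weights
    pooled = sum(wi * b for wi, b in zip(w_star, betas)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0  # I^2 in percent
    return pooled, pooled_se, q, i2


# Hypothetical per-study log odds ratios and standard errors for one variant.
pooled, pooled_se, q, i2 = random_effects_meta([0.10, 0.30, 0.20],
                                               [0.05, 0.05, 0.05])
```

When the studies disagree more than their standard errors allow, the estimated between-study variance widens the pooled standard error, which is why a heterogeneous signal can lose genome-wide significance under this model.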

  • Article type: Journal Article
    BACKGROUND: The polygenic risk score (PRS) aggregates the effects of numerous genetic variants associated with a condition across the human genome and may help to predict late-onset Alzheimer's disease (LOAD). Most current PRS studies on Alzheimer's disease (AD) have been conducted in populations of Caucasian ancestry, while the PRS has been less studied in Chinese populations.
    OBJECTIVE: To establish and examine the validity of a Chinese PRS, and to explore its racial heterogeneity.
    METHODS: We constructed a PRS using both discovery (N = 2012) and independent validation samples (N = 1008) from a Chinese population. The associations between the PRS and age at onset of LOAD or cerebrospinal fluid (CSF) biomarkers were assessed. We also replicated the PRS in an independent replication cohort with CSF data and constructed an alternative PRS using European weights.
    METHODS: Multi-center genetics study.
    METHODS: A total of 3020 subjects were included in the study.
    METHODS: The PRS was calculated using genome-wide association study data, and its performance was evaluated alone (PRSnoAPOE) and with other predictors (full model: LOAD ~ PRSnoAPOE + APOE + sex + age) by measuring the area under the receiver operating characteristic curve (AUC).
    RESULTS: The PRS of the full model achieved the highest AUC of 84.0% (95% CI = 81.4-86.5) with pT < 0.5, compared with the model containing APOE alone (61.0%). The AUC of the PRS with pT < 5e-8 was 77.8% in the PRSnoAPOE model and 81.5% in the full model, and ranged only from 67.5% to 75.1% in the PRS with the European weights model. A higher PRS was significantly associated with an earlier age at onset (P < 0.001). The PRS also performed well in the replication cohort of the full model (AUC = 83.1%, 95% CI = 74.3-92.0). The CSF biomarkers Aβ42 and the Aβ42/Aβ40 ratio were significantly inversely associated with the PRS, while p-Tau181 showed a positive association.
    CONCLUSIONS: These findings suggest that the PRS reveals genetic heterogeneity and that higher prediction accuracy of the PRS for AD can be achieved using a base dataset and validation within the same ethnicity. The effective PRS model has the clinical potential to predict individuals at risk of developing LOAD at a given age and with abnormal levels of CSF biomarkers in the Chinese population.
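As an illustration of the two quantities reported above, the sketch below computes a PRS as a weighted sum of risk-allele dosages restricted to variants below a p-value threshold (pT), and an AUC as the probability that a randomly chosen case outscores a randomly chosen control. It covers only a PRSnoAPOE-style score; the full model in the abstract also includes APOE, sex, and age, which are not modelled here. All weights, p-values, dosages, and scores are invented for illustration.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float],
                    weights: Sequence[float],
                    pvalues: Sequence[float],
                    p_threshold: float) -> float:
    """Sum of weight * dosage over variants with GWAS p-value below p_threshold."""
    return sum(d * w
               for d, w, p in zip(dosages, weights, pvalues)
               if p < p_threshold)


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Share of case-control pairs in which the case scores higher (ties count half)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical per-variant effect sizes and GWAS p-values.
WEIGHTS = [0.2, -0.1, 0.05]
PVALUES = [1e-9, 0.01, 0.4]

# One individual's risk-allele dosages (0, 1, or 2 copies per variant).
person = [2, 1, 0]
score_lenient = polygenic_score(person, WEIGHTS, PVALUES, 0.5)   # pT < 0.5
score_strict = polygenic_score(person, WEIGHTS, PVALUES, 5e-8)   # pT < 5e-8

# Hypothetical scores for three cases and three controls.
example_auc = auc([0.4, 0.3, 0.1], [0.2, 0.1, 0.0])
```

A lenient threshold admits more variants with small effects, while a strict one keeps only genome-wide significant variants; comparing the resulting AUCs is how a threshold such as pT < 0.5 is selected.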

  • Article type: Journal Article
    Pancreatic intraepithelial neoplasias (PanINs) are the most common precursors of pancreatic cancer, but their small size and inaccessibility in humans make them challenging to study. Critically, the number, dimensions and connectivity of human PanINs remain largely unknown, precluding important insights into early cancer development. Here, we provide a microanatomical survey of human PanINs by analysing 46 large samples of grossly normal human pancreas with a machine-learning pipeline for quantitative 3D histological reconstruction at single-cell resolution. To elucidate genetic relationships between and within PanINs, we developed a workflow in which 3D modelling guides multi-region microdissection and targeted and whole-exome sequencing. From these samples, we calculated a mean burden of 13 PanINs per cm³ and extrapolated that the normal intact adult pancreas harbours hundreds of PanINs, almost all with oncogenic KRAS hotspot mutations. We found that most PanINs originate as independent clones with distinct somatic mutation profiles. Some spatially continuous PanINs were found to contain multiple KRAS mutations; computational and in situ analyses demonstrated that different KRAS mutations localize to distinct cell subpopulations within these neoplasms, indicating their polyclonal origins. The extensive multifocality and genetic heterogeneity of PanINs raise important questions about mechanisms that drive precancer initiation and confer differential progression risk in the human pancreas. This detailed 3D genomic mapping of molecular alterations in human PanINs provides an empirical foundation for early detection and rational interception of pancreatic cancer.

  • Article type: Journal Article
    Cerebral palsy (CP) is the most common motor disability in children. To ascertain the role of major genetic variants in the etiology of CP, we conducted exome sequencing on a large-scale cohort with clinical manifestations of CP. The study cohort comprised 505 girls and 1,073 boys. Utilizing the current gold standard in genetic diagnostics, 387 of these 1,578 children (24.5%) received genetic diagnoses. We identified 412 pathogenic and likely pathogenic (P/LP) variants across 219 genes associated with neurodevelopmental disorders, and 59 P/LP copy number variants. The genetic diagnostic rate of children with CP labeled at birth with perinatal asphyxia was higher than the rate in children without asphyxia (P = 0.0033). Also, 33 children with CP manifestations (8.5%, 33 of 387) had findings that were clinically actionable. These results highlight the need for early genetic testing in children with CP, especially those with risk factors like perinatal asphyxia, to enable evidence-based medical decision-making.

  • Article type: Journal Article
    Intratumor heterogeneity (ITH) of bladder cancer (BLCA) contributes to therapy resistance and immune evasion, affecting clinical prognosis. The molecular and cellular mechanisms contributing to BLCA ITH generation remain elusive. Integrative single-cell atlas analysis shows that a TM4SF1-positive cancer subpopulation (TPCS) can generate ITH in BLCA. Extensive profiling of the epigenome and transcriptome of all stages of BLCA revealed their evolutionary trajectories. Distinct ancestor cells gave rise to low-grade noninvasive and high-grade invasive BLCA. Epigenome reprogramming led to transcriptional heterogeneity in BLCA. During early oncogenesis, epithelial-to-mesenchymal transition generated TPCS. TPCS had stem-cell-like properties and exhibited transcriptional plasticity, priming the development of transcriptionally heterogeneous descendant cell lineages. Moreover, TPCS prevalence in tumors is associated with advanced-stage cancer and poor prognosis. The results of this study suggest that bladder cancer interacts with its environment by acquiring a stem-cell-like epigenomic landscape, which might generate ITH without additional genetic diversification.

  • Article type: Journal Article
    Cancer heterogeneity analysis is essential for precision medicine. Most of the existing heterogeneity analyses only consider a single type of data and ignore the possible sparsity of important features. In cancer clinical practice, it has been suggested that two types of data, pathological imaging and omics data, are commonly collected and can produce hierarchical heterogeneous structures, in which the refined sub-subgroup structure determined by omics features can be nested in the rough subgroup structure determined by the imaging features. Moreover, sparsity pursuit has extraordinary significance and is more challenging for heterogeneity analysis, because the important features may not be the same in different subgroups, which is ignored by the existing heterogeneity analyses. Fortunately, rich information from previous literature (for example, those deposited in PubMed) can be used to assist feature selection in the present study. Advancing from the existing analyses, in this study, we propose a novel sparse hierarchical heterogeneity analysis framework, which can integrate two types of features and incorporate prior knowledge to improve feature selection. The proposed approach has satisfactory statistical properties and competitive numerical performance. A TCGA real data analysis demonstrates the practical value of our approach in analyzing data heterogeneity and sparsity.

  • Article type: Journal Article
    Intra-tissue genetic heterogeneity is universal to both healthy and cancerous tissues. It emerges from the stochastic accumulation of somatic mutations throughout development and homeostasis. By combining population genetics theory and genomic information, genetic heterogeneity can be exploited to infer tissue organization and dynamics in vivo. However, many basic quantities, for example the dynamics of tissue-specific stem cells, remain difficult to quantify precisely. Here, we show that single-cell and bulk sequencing data inform on different aspects of the underlying stochastic processes. Bulk-derived variant allele frequency (VAF) spectra show transitions from growing to constant stem cell populations with age in samples of healthy esophagus epithelium. Single-cell mutational burden distributions allow a sample-size-independent measure of mutation and proliferation rates. Mutation rates in adult hematopoietic stem cells are higher compared to inferences during development, suggesting additional proliferation-independent effects. Furthermore, single-cell-derived VAF spectra contain information on the number of tissue-specific stem cells. In hematopoiesis, we find approximately 2 × 10^5 HSCs, if all stem cells divide symmetrically. However, the single-cell mutational burden distribution is over-dispersed compared to a model of Poisson-distributed random mutations. A time-associated model of mutation accumulation with a constant rate alone cannot generate such a pattern. At least one additional source of stochasticity would be needed. Possible candidates for these processes may be occasional bursts of stem cell divisions, potentially in response to injury, or non-constant mutation rates either through environmental exposures or cell-intrinsic variation.
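Over-dispersion relative to a Poisson model can be checked with the index of dispersion, the ratio of the sample variance to the sample mean, which is close to 1 for Poisson-distributed counts and larger when an extra source of variability is present. The sketch below is a generic illustration of that check, not the authors' inference procedure, and the per-cell mutation counts are invented.

```python
from statistics import mean, variance
from typing import Sequence


def dispersion_index(counts: Sequence[int]) -> float:
    """Sample variance divided by sample mean; about 1 under a Poisson model."""
    return variance(counts) / mean(counts)


# Hypothetical mutational burdens (mutations per cell) for six single cells.
burdens = [10, 12, 8, 30, 5, 25]

index = dispersion_index(burdens)
# Values well above 1 point to over-dispersion relative to Poisson.
over_dispersed = index > 1.0
```

With real data the comparison would need a formal test and many more cells, since the ratio fluctuates considerably in small samples.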