Keywords: biomarker discovery; clustering; microarrays; molecular subtype; pipeline; subtyping benchmark; triple-negative breast cancer

Source: DOI:10.3390/cancers14112571

Abstract:
Triple-negative breast cancer (TNBC) is a heterogeneous disease with diverse, often poor prognoses and treatment responses. To identify targetable biomarkers and guide personalized care, scientists have developed multiple molecular classification systems for TNBC based on transcriptomic profiling. However, there is no consensus on the molecular subtypes of TNBC, likely because of discrepancies in the technical and computational methods used by different research groups. Here, we reassessed the major steps of TNBC subtyping, validated the reproducibility of established TNBC subtypes, and, using a larger sample size, identified two additional subtypes. By comparing results from different workflows, we demonstrated the limitations of formalin-fixed, paraffin-embedded samples and of batch effect removal across microarray platforms. We also refined the usage of computational tools for TNBC subtyping. Furthermore, we integrated high-quality multi-institutional TNBC datasets (discovery set: n = 457; validation set: n = 165). By performing unsupervised clustering on the discovery and validation sets independently, we validated four previously discovered subtypes: luminal androgen receptor, mesenchymal, immunomodulatory, and basal-like immunosuppressed. Additionally, we identified two potential intermediate states of TNBC tumors based on their resemblance to more than one well-characterized subtype. In summary, we addressed the issues and limitations of previous TNBC subtyping through comprehensive analyses. Our results promote the rational design of future subtyping studies and provide new insights into TNBC patient stratification.