Keywords: SEER; machine learning; metastasis; ovarian clear cell carcinoma

MeSH: Female; Humans; Bayes Theorem; Algorithms; Carcinoma, Ovarian Epithelial; Adenocarcinoma, Clear Cell; Machine Learning; Ovarian Neoplasms

Source: DOI:10.1002/cam4.7161

Abstract:
BACKGROUND: Ovarian clear cell carcinoma (OCCC) represents a subtype of ovarian epithelial carcinoma (OEC) known for its limited responsiveness to chemotherapy, and the onset of distant metastasis significantly impacts patient prognoses. This study aimed to identify potential risk factors contributing to the occurrence of distant metastasis in OCCC.
METHODS: Using the Surveillance, Epidemiology, and End Results (SEER) database, we identified patients diagnosed with OCCC between 2004 and 2015. The most influential factors were selected with the Gaussian Naive Bayes (GNB) and AdaBoost machine learning algorithms, with a Venn test used for further refinement. Six machine learning (ML) techniques, namely XGBoost, LightGBM, Random Forest (RF), Adaptive Boosting (AdaBoost), Support Vector Machine (SVM), and Multilayer Perceptron (MLP), were then used to construct predictive models for distant metastasis. SHapley Additive exPlanations (SHAP) analysis provided a visual interpretation for individual patients. Model validity was assessed using accuracy, sensitivity, specificity, positive predictive value, negative predictive value, F1 score, and the area under the receiver operating characteristic curve (AUC).
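The threshold-based metrics listed in the methods can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those definitions, not the authors' code; the function names and the toy labels are hypothetical, and 1 is assumed to denote distant metastasis.

```python
import math


def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def classification_metrics(y_true, y_pred):
    """Compute the abstract's threshold-based metrics for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    sensitivity = safe_div(tp, tp + fn)   # recall among true positives
    specificity = safe_div(tn, tn + fp)   # recall among true negatives
    ppv = safe_div(tp, tp + fp)           # positive predictive value (precision)
    npv = safe_div(tn, tn + fn)           # negative predictive value
    accuracy = safe_div(tp + tn, len(pairs))
    f1 = safe_div(2 * ppv * sensitivity, ppv + sensitivity)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Toy example (made-up labels, not SEER data): 2 positives among 10 patients.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

When positives are rare, as in the toy labels, a handful of false positives is enough to pull PPV and F1 down even if sensitivity and specificity look acceptable.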
RESULTS: The Random Forest (RF) model outperformed the other five machine learning algorithms in predicting distant metastasis. Its accuracy, sensitivity, specificity, positive predictive value, negative predictive value, F1 score, and AUC (with 95% CIs) were 0.792 (0.762-0.823), 0.904 (0.835-0.973), 0.759 (0.731-0.787), 0.221 (0.186-0.256), 0.974 (0.967-0.982), 0.353 (0.306-0.399), and 0.834 (0.696-0.967), respectively. The RF model also had the lowest calibration-curve Brier score, 0.06256 (95% CI 0.05753-0.06759). SHAP analysis provided independent explanations, reaffirming the critical clinical factors associated with the risk of metastasis in OCCC patients.
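The Brier score reported for calibration is the mean squared difference between the predicted probability and the observed 0/1 outcome, so lower values indicate better-calibrated predictions. A minimal sketch of that definition follows; the function name and the toy probabilities are hypothetical and are not taken from the study.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and binary outcomes."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Toy example (made-up values): outcomes and predicted metastasis probabilities.
outcomes = [1, 0, 0, 1]
probabilities = [0.9, 0.2, 0.1, 0.6]
score = brier_score(outcomes, probabilities)
```

For reference, a model that always predicts the outcome prevalence p scores p(1 - p), so a Brier score is best judged against the event rate of the cohort rather than in isolation.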
CONCLUSIONS: This study successfully established a precise predictive model for OCCC patient metastasis using machine learning techniques, offering valuable support to clinicians in making informed clinical decisions.