OBJECTIVE: This study aims to develop an ensemble machine learning-based (EML-based) risk prediction model for radiation dermatitis (RD) in patients with head and neck cancer undergoing proton radiotherapy, and to achieve better predictive performance than traditional models.
METHODS: Data from 57 head and neck cancer patients treated with intensity-modulated proton therapy at Kaohsiung Chang Gung Memorial Hospital were analyzed. The study incorporated 11 clinical and 9 dosimetric parameters. Pearson's correlation was used to eliminate highly correlated variables, followed by feature selection via LASSO to focus on potential RD predictors. Model training involved traditional logistic regression (LR) and advanced ensemble methods such as Random Forest and XGBoost, which were optimized through hyperparameter tuning.
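As an illustration of the correlation-filtering step described in the Methods, the following minimal sketch drops any variable whose absolute Pearson correlation with an already retained variable reaches a cutoff. The cutoff of 0.8, the function names `pearson` and `drop_correlated`, and the toy feature values are assumptions made for this example; the abstract does not report the threshold or the order in which variables were compared.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (sd_x * sd_y)


def drop_correlated(columns, threshold=0.8):
    """Keep a column only if |r| with every kept column is below threshold.

    `threshold` is a hypothetical cutoff chosen for illustration.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (invented): two nearly collinear dose metrics and one clinical variable.
features = {
    "mean_dose": [10.0, 20.0, 30.0, 40.0, 50.0],
    "max_dose": [12.0, 22.0, 31.0, 43.0, 52.0],
    "age": [61.0, 45.0, 70.0, 52.0, 49.0],
}
selected = drop_correlated(features)
print(selected)
```

In this sketch the first variable of a correlated pair is the one retained, so the result depends on column order; a real analysis might instead keep whichever variable is more clinically meaningful.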
RESULTS: Feature selection identified six key predictors, including smoking history and specific dosimetric parameters. The ensemble machine learning models outperformed LR, with XGBoost achieving the highest AUC (0.890). Feature importance was assessed using SHAP (SHapley Additive exPlanations) values, which underscored the relevance of various clinical and dosimetric factors in predicting RD.
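For readers less familiar with the AUC metric reported above, the sketch below computes it from its rank interpretation: the probability that a randomly chosen patient with RD receives a higher predicted score than a randomly chosen patient without RD, with ties counted as one half. The function name `roc_auc` and the labels and scores are invented for illustration and are not the study's data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example (invented): 1 = developed RD, 0 = did not.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.7, 0.5, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

Here eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of this size.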
CONCLUSIONS: The study confirms that EML methods, especially XGBoost with its boosting algorithm, provide superior predictive accuracy, enhanced feature selection, and improved data handling compared to traditional LR. While LR offers greater interpretability, the precision and broader applicability of EML make it more suitable for complex medical prediction tasks, such as predicting radiation dermatitis. Given these advantages, EML is highly recommended for further research and application in clinical settings.