OBJECTIVE: We investigate the global feature importance of different ML models. By providing information on the most relevant features, we can facilitate the use of ML in everyday medical practice.
METHODS: The data are provided by the Cancer Registry Rhineland-Palatinate gGmbH, Germany, and consist of numerical and categorical features of 1,944 patients with UBC. We retrospectively predict 2-year recurrence with three ML models: a Support Vector Machine, Gradient Boosting, and an Artificial Neural Network. We then determine the global feature importance of each model using the performance-based Permutation Feature Importance (PFI) and the variance-based Feature Importance Ranking Measure (FIRM).
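As an illustration of the performance-based idea behind PFI, the following minimal sketch shuffles one feature at a time and records how much the F1-Score drops. It is not the implementation used in this study: the function names, the toy rule-based model, the synthetic data, and the choice of 20 repetitions are all assumptions made for the example.

```python
import random


def f1_score(y_true, y_pred):
    """F1-Score for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def permutation_feature_importance(predict, X, y, metric=f1_score,
                                   n_repeats=20, seed=0):
    """Mean drop in `metric` when each feature column is shuffled.

    `predict` is any callable mapping one feature row to a label
    (hypothetical stand-in for a trained black-box model).
    """
    rng = random.Random(seed)
    baseline = metric(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            score = metric(y, [predict(row) for row in X_perm])
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy setup (assumption): the "model" uses feature 0 and ignores feature 1.
def toy_model(row):
    return 1 if row[0] > 0.5 else 0


data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

importances = permutation_feature_importance(toy_model, X, y)
print(importances)
```

In this toy setup the ignored feature receives an importance of exactly zero, while shuffling the feature the model relies on lowers the F1-Score noticeably. Applied to a real model, the same loop would be run on held-out data with the trained model's prediction function.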
RESULTS: We show reliable recurrence prediction of UBC on test data, with an F1-Score of 82.02% to 83.89%, a Precision of 83.95% to 84.49%, and an AUC of 69.20% to 70.82%, depending on the model. Gradient Boosting achieves the highest average F1-Score (83.89%) and AUC (70.82%) among the black-box models, with a Precision of 83.95%. Furthermore, PFI and FIRM are consistent with each other: both identify the same features as relevant across the different models. These features are exclusively therapeutic measures and agree with findings from both medical research and clinical trials.
CONCLUSIONS: We confirm that black-box ML models outperform more traditional logistic regression in predicting UBC recurrence. In addition, we present an approach that increases the explanatory power of black-box models by identifying the underlying influence of their input features. This facilitates the use of ML in clinical practice and thereby makes the improved recurrence prediction of black-box models available there.