Keywords: Ensemble learning; Imbalance learning; MRI morphometry; White matter hyperintensities; XGBoost algorithm

Source: DOI:10.1007/s10278-023-00958-y

Abstract:
Leukoaraiosis (LA) is strongly associated with impaired cognition and increased dementia risk. Effective and robust methods of identifying LA patients with mild cognitive impairment (LA-MCI) are important for clinical intervention and disease monitoring. In this study, an ensemble learning method that combines multiple magnetic resonance imaging (MRI) morphological features is proposed to distinguish LA-MCI patients from LA patients without cognitive impairment (LA-nCI). Multiple morphological measures (including gray matter volume (GMV), cortical thickness (CT), surface area (SA), cortical volume (CV), sulcus depth (SD), fractal dimension (FD), and gyrification index (GI)) are extracted from MRI to give the model richer information about disease characteristics. Then, using the extreme gradient boosting (XGBoost) classifier as the base learner, a weighted soft-voting framework combines a data-level resampling model (Fusion + XGBoost) with an algorithm-level focal loss (FL)-improved XGBoost model (FL-XGBoost) to address class imbalance and improve classification performance and stability. The baseline XGBoost model trained on the original imbalanced dataset had a balanced accuracy (Bacc) of 78.20%. The separate Fusion + XGBoost and FL-XGBoost models achieved Bacc scores of 80.53% and 81.25%, respectively, improvements of 2.33 and 3.05 percentage points over the baseline. The fused model distinguishes LA-MCI from LA-nCI with an overall accuracy of 84.82%. Sensitivity and specificity also improved (85.50% and 84.14%, respectively). This improved model has the potential to facilitate the clinical diagnosis of LA-MCI.
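The two quantitative ideas in the abstract, weighted soft voting and balanced accuracy, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the equal weight of 0.5, the 0.5 decision threshold, and the toy probabilities are hypothetical and are not taken from the paper, which does not state its weights in the abstract. The sketch blends the positive-class probabilities of two models and scores the result as the mean of sensitivity and specificity.

```python
def weighted_soft_vote(probs_a, probs_b, weight_a=0.5):
    """Blend two models' positive-class probabilities with a convex weight.

    weight_a is a hypothetical setting; the abstract does not give the weights.
    """
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(probs_a, probs_b)]


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on class 1) and specificity (recall on class 0)."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    sensitivity = sum(1 for p in pos if p == 1) / len(pos)
    specificity = sum(1 for p in neg if p == 0) / len(neg)
    return 0.5 * (sensitivity + specificity)


def threshold(probs, cutoff=0.5):
    """Turn probabilities into 0/1 labels at an assumed cutoff."""
    return [1 if p >= cutoff else 0 for p in probs]


# Toy, imbalanced labels (1 = LA-MCI, 0 = LA-nCI); values are made up.
y_true = [1, 1, 0, 0, 0, 0]
p_resampled = [0.9, 0.4, 0.2, 0.6, 0.1, 0.3]  # stand-in for Fusion + XGBoost
p_focal = [0.8, 0.7, 0.3, 0.3, 0.2, 0.4]      # stand-in for FL-XGBoost

p_fused = weighted_soft_vote(p_resampled, p_focal, weight_a=0.5)
bacc_single = balanced_accuracy(y_true, threshold(p_resampled))
bacc_fused = balanced_accuracy(y_true, threshold(p_fused))
print(bacc_single, bacc_fused)
```

In this toy case the blended probabilities correct one false negative and one false positive made by the first model alone. As a side check on the reported figures, the mean of the stated sensitivity and specificity, (85.50 + 84.14) / 2, is 84.82, which matches the accuracy quoted for the fused model.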