%0 Journal Article
%T Evaluation of machine learning-based classification of clinical impairment and prediction of clinical worsening in multiple sclerosis.
%A Noteboom S
%A Seiler M
%A Chien C
%A Rane RP
%A Barkhof F
%A Strijbis EMM
%A Paul F
%A Schoonheim MM
%A Ritter K
%J J Neurol
%V 0
%N 0
%D 2024 Jun 23
%M 38909341
%F 6.682
%R 10.1007/s00415-024-12507-w
%X BACKGROUND: Robust predictive models of clinical impairment and worsening in multiple sclerosis (MS) are needed to identify patients at risk and optimize treatment strategies.
OBJECTIVE: To evaluate whether machine learning (ML) methods can classify clinical impairment and predict worsening in people with MS (pwMS) and, if so, which combination of clinical and magnetic resonance imaging (MRI) features and ML algorithm is optimal.
METHODS: We used baseline clinical and structural MRI data from two MS cohorts (Berlin: n = 125, Amsterdam: n = 330) to evaluate the capability of five ML models in classifying clinical impairment at baseline and predicting future clinical worsening over a follow-up of 2 and 5 years. Clinical worsening was defined by increases in the Expanded Disability Status Scale (EDSS), Timed 25-Foot Walk Test (T25FW), 9-Hole Peg Test (9HPT), or Symbol Digit Modalities Test (SDMT). Different combinations of clinical and volumetric MRI measures were systematically assessed in predicting clinical outcomes. ML models were evaluated using Monte Carlo cross-validation, area under the curve (AUC), and permutation testing to assess significance.
RESULTS: The ML models significantly determined clinical impairment at baseline for the Amsterdam cohort, but did not reach significance for predicting clinical worsening over a follow-up of 2 and 5 years. High disability (EDSS ≥ 4) was best determined by a support vector machine (SVM) classifier using clinical and global MRI volumes (AUC = 0.83 ± 0.07, p = 0.015). Impaired cognition (SDMT Z-score ≤ -1.5) was best determined by an SVM using regional MRI volumes (thalamus, ventricles, lesions, and hippocampus), reaching an AUC of 0.73 ± 0.04 (p = 0.008).
CONCLUSIONS: ML models could aid in classifying pwMS with clinical impairment and identify relevant biomarkers, but prediction of clinical worsening is an unmet need.
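NOTE (illustrative, not part of the published abstract): The evaluation scheme named in METHODS (Monte Carlo cross-validation, AUC, and permutation testing) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' pipeline: the data are synthetic, the nearest-centroid scorer is a simple stand-in for the five ML models (which the abstract does not list in full), the stratified 80/20 splits and the "+1" permutation p-value convention are assumptions, and all function names are hypothetical.

```python
import random
from statistics import mean

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid(rows):
    """Per-feature mean of a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_predict(X_train, y_train, X_test):
    """Stand-in classifier (assumption): nearest-centroid score.
    Higher score means the sample lies closer to the class-1 centroid."""
    c1 = centroid([x for x, l in zip(X_train, y_train) if l == 1])
    c0 = centroid([x for x, l in zip(X_train, y_train) if l == 0])
    return [sq_dist(x, c0) - sq_dist(x, c1) for x in X_test]

def monte_carlo_cv(X, y, n_splits=50, test_frac=0.2, seed=0):
    """Mean AUC over repeated random train/test splits.
    Splits are stratified by class (an assumption) so every test set
    contains both classes and the AUC is always defined."""
    rng = random.Random(seed)
    pos = [i for i, l in enumerate(y) if l == 1]
    neg = [i for i, l in enumerate(y) if l == 0]
    aucs = []
    for _ in range(n_splits):
        test = []
        for idx in (pos, neg):
            test += rng.sample(idx, max(1, round(test_frac * len(idx))))
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        scores = fit_predict([X[i] for i in train], [y[i] for i in train],
                             [X[i] for i in test])
        aucs.append(auc(scores, [y[i] for i in test]))
    return mean(aucs)

def permutation_p_value(X, y, n_perm=100, seed=0):
    """Compare the observed mean AUC with mean AUCs on shuffled labels."""
    rng = random.Random(seed)
    observed = monte_carlo_cv(X, y, seed=seed)
    exceed = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break any feature-label association
        if monte_carlo_cv(X, y_perm, seed=seed) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)

# Synthetic example (hypothetical data): one informative feature, one noise feature.
data_rng = random.Random(42)
y = [1] * 40 + [0] * 60
X = [[data_rng.gauss(1.5 if l else 0.0, 1.0), data_rng.gauss(0.0, 1.0)] for l in y]
observed, p = permutation_p_value(X, y, n_perm=100)
print(f"mean AUC = {observed:.2f}, permutation p = {p:.3f}")
```

With 100 permutations the smallest attainable p-value is 1/101, so the number of permutations bounds the resolution of the significance estimate.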