BACKGROUND: Children with IgA vasculitis (IgAV) may develop renal complications that affect their long-term prognosis. This study aimed to build a machine learning model to predict renal damage in children with IgAV and to analyze risk factors for IgA vasculitis with nephritis (IgAVN).
METHODS: Fifty clinical indicators were collected from 217 inpatients at our hospital. Six machine learning algorithms (logistic regression, linear discriminant analysis, K-nearest neighbor, support vector machine, decision tree, and random forest) were compared to select the model with the highest predictive performance. A simplified model was developed through feature importance ranking and validated in an additional cohort of 46 patients.
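The comparison and feature-ranking workflow described in METHODS can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the synthetic data from `make_classification`, the five-fold cross-validation, the ROC AUC scoring, and all hyperparameters are assumptions, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cohort (assumption): 217 samples, 50 indicators.
X, y = make_classification(
    n_samples=217, n_features=50, n_informative=11, random_state=0
)

# The six algorithm families named in the abstract; settings are illustrative.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "lda": LinearDiscriminantAnalysis(),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Score every model with the same stratified folds (assumed protocol).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
mean_auc = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}
best_model = max(mean_auc, key=mean_auc.get)

# Rank features by random forest importance and keep the top 11
# as candidate inputs for a simplified model.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top11 = forest.feature_importances_.argsort()[::-1][:11]

print(best_model, sorted(top11.tolist()))
```

Scaling is wrapped into the pipelines for the distance- and margin-based models so that it is refit inside each fold, which avoids leaking information from the held-out fold.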
RESULTS: The random forest model had the highest accuracy, precision, recall, F1 score, and area under the curve, with values of 0.91, 0.98, 0.70, 0.79, and 0.94, respectively. The top 11 features by importance ranking were anti-streptolysin O, corticosteroid therapy, antihistamine therapy, absolute eosinophil count, immunoglobulin E, anticoagulant therapy, C-reactive protein, prothrombin time, age at onset, D-dimer, and recurrence of rash ≥ 3 times. A simplified model using these features achieved an accuracy of 84.2%, a sensitivity of 89.4%, and a specificity of 82.5% in external validation. Finally, we provide a web tool based on the simplified model, with code available at https://github.com/mulanruo/IgAVN_Prediction.
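For readers less familiar with the metrics reported in RESULTS, the sketch below shows how they follow from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts (assumption), not taken from the validation cohort.
metrics = binary_metrics(tp=8, fp=2, tn=18, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are computed on the positive and negative classes separately, so they can differ considerably from overall accuracy when the classes are imbalanced.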
CONCLUSIONS: The model based on the random forest algorithm demonstrates good performance in predicting renal damage in children with IgAV, providing a basis for early clinical diagnosis and decision-making.