METHODS: A total of 158 children were enrolled from Women and Children's Hospital, Qingdao University, and split 70%/30% into a training set and a test set for model development and validation. Several classifiers were constructed, including random forest (RF), logistic regression (LR), and eXtreme Gradient Boosting (XGBoost). The data were preprocessed before the classifiers were applied. To reduce the risk of overfitting, 5-fold cross-validation was used across all the data.
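The partitioning described in the methods can be illustrated with a minimal sketch. It assumes a simple shuffled split without stratification and assumes the 5 folds are drawn from the training portion; the abstract does not specify either detail, and the function names and seed below are hypothetical.

```python
import random

def train_test_split_indices(n_samples, test_fraction=0.3, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into train and test lists."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(indices, k=5):
    """Yield (fit, validate) index lists for k folds of nearly equal size."""
    folds = [indices[i::k] for i in range(k)]
    for i, validate in enumerate(folds):
        # Every fold except fold i is used for fitting.
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validate

# 158 children, 70%/30% split, then 5 folds within the training portion.
train_idx, test_idx = train_test_split_indices(158, test_fraction=0.3, seed=0)
folds = list(k_fold_indices(train_idx, k=5))
```

With 158 samples this yields 111 training and 47 test indices. For an imbalanced clinical outcome, a stratified split that preserves the class ratio in each partition would usually be preferable.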
RESULTS: In validation on the test set, the area under the curve (AUC) of the RF model was 0.925, and its average accuracy was 0.930 (95% CI, 0.905 to 0.956). The AUC of the LR model was 0.888, and its average accuracy was 0.893 (95% CI, 0.837 to 0.950). The AUC of the XGBoost model was 0.879, and its average accuracy was 0.935 (95% CI, 0.891 to 0.980).
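For readers unfamiliar with the two metrics reported, the sketch below shows one common way to compute them. It assumes the AUC is the rank-based (Mann-Whitney) estimate and that the 95% CI is a normal approximation around the mean accuracy across folds; the abstract does not state how either was computed. All values are toy numbers for illustration, not study data.

```python
import math
import statistics

def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_with_ci95(values):
    """Mean and normal-approximation 95% CI (mean +/- 1.96 * standard error)."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    half_width = 1.96 * sem
    return mean, mean - half_width, mean + half_width

# Toy labels and predicted scores (hypothetical).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.45, 0.2]
toy_auc = auc_from_scores(toy_labels, toy_scores)

# Toy per-fold accuracies (hypothetical).
toy_fold_accuracies = [0.80, 0.85, 0.90, 0.85, 0.85]
toy_mean, toy_low, toy_high = mean_with_ci95(toy_fold_accuracies)
```

With only five folds, a t-distribution multiplier would give a wider and arguably more appropriate interval than 1.96.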
CONCLUSIONS: The RF algorithm was used in the present study to construct an effective prediction model for CAL, with an accuracy of 0.930 and an AUC of 0.925. This novel ML-based model may help guide clinicians in the initial decision to start more aggressive anti-inflammatory therapy. Because external validation was limited and the cohort reflects regional population characteristics, additional research is required before the model is applied in the clinic.