Keywords: Gradient boosting decision tree; Intraoperative hemorrhage; LGBoost; Machine learning

MeSH: Humans; Adult; Middle Aged; Aged; China; Inpatients; Algorithms; Databases, Factual; Machine Learning

Source: DOI:10.1186/s12911-023-02253-w

Abstract:
Background: Prediction tools for intraoperative bleeding events remain scarce. We aimed to develop machine learning-based models and to identify the most important predictors using real-world data from electronic medical records (EMRs).
Methods: An established database of surgical inpatients in Shanghai was used for the analysis. A total of 51,173 inpatients were assessed for eligibility, of whom 48,543 were included in the dataset. Patients were divided into haemorrhage (N = 9728) and without-haemorrhage (N = 38,815) groups according to whether bleeding occurred during the procedure. Candidate predictors were selected from 27 variables, including sex (N = 48,543), age (N = 48,543), BMI (N = 48,543), renal disease (N = 26), heart disease (N = 1309), hypertension (N = 9579), diabetes (N = 4165), coagulopathy (N = 47), and other features. The models were constructed with 7 machine learning algorithms: light gradient boosting (LGB), extreme gradient boosting (XGB), categorical boosting (CatBoost; CatB), Ada-boosting of decision trees (AdaB), logistic regression (LR), long short-term memory (LSTM), and multilayer perceptron (MLP). The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance.
Results: The mean age of the inpatients was 53 ± 17 years, and 57.5% were male. Across multiple metrics, LGB showed the best predictive performance for intraoperative bleeding (AUC = 0.933, sensitivity = 0.87, specificity = 0.85, accuracy = 0.87) compared with XGB, CatB, AdaB, LR, MLP, and LSTM. The three most important predictors identified by LGB were operative time, D-dimer (DD), and age.
Conclusions: We propose LGB as the best gradient boosting decision tree (GBDT) algorithm for the evaluation of intraoperative bleeding. It may serve as a simple and useful tool for predicting intraoperative bleeding in clinical settings. Operative time, DD, and age warrant particular attention.
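
The four metrics reported above (AUC, sensitivity, specificity, accuracy) can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `threshold_metrics`, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for illustration, and the abstract does not state which threshold or software was used.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Ties count as half a win. This pairwise form is O(P * N) and is
    meant for small illustrative inputs only."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy at a fixed decision threshold.
    The default threshold of 0.5 is an assumption, not taken from the paper."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),   # share of bleeding cases detected
        "specificity": tn / (tn + fp),   # share of non-bleeding cases cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = haemorrhage, 0 = no haemorrhage.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    predicted_risk = [0.90, 0.80, 0.25, 0.70, 0.30, 0.20, 0.10, 0.05]
    print("AUC:", round(roc_auc(labels, predicted_risk), 3))
    print(threshold_metrics(labels, predicted_risk))
```

AUC is threshold-free, whereas sensitivity, specificity, and accuracy depend on the chosen cut-off. With roughly one in five patients in the haemorrhage group (9728 of 48,543), accuracy alone can look favourable for a model that rarely predicts bleeding, which is one reason to read it alongside the other three metrics.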