METHODS: This was a retrospective case-control study of patients with definite CP with or without EPI. In total, 49 candidate predictor variables were used to train a Classification and Regression Tree (CART) model, which ranked all predictors and selected a parsimonious set of predictors of EPI status. Five-fold cross-validation was used to assess generalizability, and the full CART model was compared with 4 additional predictive models. The EPI misclassification rate (mRate) served as the primary endpoint.
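To illustrate how a training error differs from a 5-fold cross-validated prediction error, the following is a minimal sketch in plain Python. It is not the study's model: a one-split "decision stump" stands in for the full CART, the data are synthetic, and every function name here (`misclassification_rate`, `fit_stump`, `k_fold_indices`, `cross_validated_error`) is hypothetical.

```python
import random


def misclassification_rate(y_true, y_pred):
    """Fraction of cases whose predicted class differs from the true class."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def fit_stump(xs, ys):
    """Fit a one-split classifier that predicts 1 when x > threshold.

    This is only a stand-in for CART, which splits recursively on many variables.
    """
    best_threshold, best_error = None, float("inf")
    for threshold in sorted(set(xs)):
        error = misclassification_rate(ys, predict_stump(threshold, xs))
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold


def predict_stump(threshold, xs):
    """Apply the fitted split to a list of predictor values."""
    return [1 if x > threshold else 0 for x in xs]


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_error(xs, ys, k=5, seed=0):
    """Pool held-out predictions over k folds and return their mRate."""
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the other k-1 folds only, then predict the held-out fold.
        threshold = fit_stump([xs[i] for i in train], [ys[i] for i in train])
        y_pred.extend(predict_stump(threshold, [xs[i] for i in held_out]))
        y_true.extend(ys[i] for i in held_out)
    return misclassification_rate(y_true, y_pred)


# Synthetic data (hypothetical, not from the study): one noisy predictor.
rng = random.Random(42)
ys = [1 if rng.random() < 0.58 else 0 for _ in range(274)]
xs = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in ys]

training_error = misclassification_rate(ys, predict_stump(fit_stump(xs, ys), xs))
prediction_error = cross_validated_error(xs, ys, k=5)
print(f"training error: {training_error:.3f}")
print(f"5-fold prediction error: {prediction_error:.3f}")
```

The training error is computed on the same cases used to fit the split, whereas the prediction error uses only cases the fitted model has not seen, which is why the latter is the more honest estimate of generalizability.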
RESULTS: A total of 274 patients with definite CP from 6 pancreatitis centers across the United States were included, of whom 58% had EPI based on predetermined criteria. The optimal CART decision tree included 10 variables. Without and with 5-fold cross-validation, the CART mRate was 0.153 (training error) and 0.314 (prediction error), and the area under the receiver operating characteristic curve was 0.889 and 0.682, respectively. Sensitivity without/with 5-fold cross-validation was 0.888/0.789, and specificity was 0.794/0.535. A second CART, trained without the pancreas imaging variables (n = 6), yielded 8 variables. Its training error/prediction error was 0.190/0.351; sensitivity was 0.869/0.650 and specificity was 0.728/0.649, each without/with 5-fold cross-validation.
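The reported error rates can be cross-checked against the reported sensitivity, specificity, and EPI prevalence, because for a single pooled confusion matrix the misclassification rate equals prevalence × (1 − sensitivity) + (1 − prevalence) × (1 − specificity). The sketch below applies that identity to the figures above; the function name `mrate_from_sens_spec` is hypothetical, and the check is only approximate because the 58% prevalence is rounded and the exact way fold results were pooled is assumed rather than stated.

```python
def mrate_from_sens_spec(prevalence, sensitivity, specificity):
    """Misclassification rate implied by prevalence, sensitivity, and specificity.

    Missed cases contribute prevalence * (1 - sensitivity);
    false alarms contribute (1 - prevalence) * (1 - specificity).
    """
    return prevalence * (1 - sensitivity) + (1 - prevalence) * (1 - specificity)


PREVALENCE = 0.58  # rounded value from the abstract

# (label, sensitivity, specificity, reported mRate), all taken from the abstract.
reported = [
    ("full CART, training", 0.888, 0.794, 0.153),
    ("full CART, 5-fold CV", 0.789, 0.535, 0.314),
    ("no-imaging CART, training", 0.869, 0.728, 0.190),
    ("no-imaging CART, 5-fold CV", 0.650, 0.649, 0.351),
]

implied = {}
for label, sens, spec, reported_mrate in reported:
    implied[label] = mrate_from_sens_spec(PREVALENCE, sens, spec)
    print(f"{label}: implied {implied[label]:.3f}, reported {reported_mrate:.3f}")
```

Under these assumptions, each implied value falls within about 0.004 of the reported mRate, so the reported figures are mutually consistent.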
CONCLUSIONS: We developed two CART models and integrated them into a single digital screening tool that assesses for EPI in patients with definite CP and requires two to six input variables to predict EPI status.