Keywords: Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD); artificial intelligence; cervical dilation; dystocia; epidural anesthesia; failure to progress in labor; fetal descent; labor disorders; labor progression; machine learning; mixed-effects; multifactor; multivariable; partogram; prediction error; rupture of membranes; station

MeSH: Humans; Female; Pregnancy; Labor Stage, First / physiology; Adult; Labor, Induced / methods; Longitudinal Studies; Machine Learning; Cesarean Section / statistics & numerical data; Cohort Studies; Labor, Obstetric / physiology; Time Factors; Young Adult

Source: DOI:10.1016/j.ajog.2024.02.289   PDF (PubMed)

Abstract:
The diagnosis of failure to progress, the most common indication for intrapartum cesarean delivery, is based on the assessment of cervical dilation and station over time. Labor curves serve as references for expected changes in dilation and fetal descent. The labor curves of Friedman, Zhang et al, and others are based on time alone and derived from mothers with spontaneous labor onset. However, labor induction is now common, and clinicians also consider other factors when assessing labor progress. Labor curves that consider the use of labor induction and other factors that influence labor progress have the potential to be more accurate and closer to clinical decision-making.
This study aimed to compare the prediction errors of labor curves based on a single factor (time) or multiple clinically relevant factors using two modeling methods: mixed-effects regression, a standard statistical method, and Gaussian processes, a machine learning method.
This was a longitudinal cohort study of changes in dilation and station based on data from 8022 births in nulliparous women with a live, singleton, vertex-presenting fetus ≥35 weeks of gestation with a vaginal delivery. New labor curves of dilation and station were generated with 10-fold cross-validation. External validation was performed using a geographically independent group. Model variables included time from the first examination in the 20 hours before delivery; dilation, effacement, and station recorded at the previous examination; cumulative contraction counts; and use of epidural anesthesia and labor induction. To assess model accuracy, differences between each model's predicted value and its corresponding observed value were calculated. These prediction errors were summarized using mean absolute error and root mean squared error statistics.
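The two summary statistics named above can be illustrated with a minimal sketch. It is not the study's code: the function names and the five observed/predicted dilation values are hypothetical and chosen only to show how the prediction errors are aggregated.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average magnitude of the prediction errors (predicted - observed)."""
    errors = [p - o for o, p in zip(observed, predicted)]
    return sum(abs(e) for e in errors) / len(errors)


def root_mean_squared_error(observed, predicted):
    """Square root of the average squared prediction error."""
    errors = [p - o for o, p in zip(observed, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical dilation values in cm (illustrative only, not study data).
observed = [4.0, 5.0, 6.0, 8.0, 10.0]
predicted = [4.5, 5.0, 7.0, 7.0, 9.5]

mae = mean_absolute_error(observed, predicted)
rmse = root_mean_squared_error(observed, predicted)
print(f"MAE = {mae:.3f} cm, RMSE = {rmse:.3f} cm")
```

Because squaring gives large errors more weight, the root mean squared error is never smaller than the mean absolute error for the same set of predictions.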
Dilation curves based on multiple parameters were more accurate than those derived from time alone. The mean absolute errors of the multifactor methods were better (lower) than that of the single-factor methods: 0.826 cm (95% confidence interval, 0.820-0.832) for the multifactor machine learning method and 0.893 cm (95% confidence interval, 0.885-0.901) for the multifactor mixed-effects method, compared with 2.122 cm (95% confidence interval, 2.108-2.136) for the single-factor methods (P<.0001 for both comparisons). The root mean squared errors of the multifactor methods were also better (lower) than that of the single-factor methods: 1.126 cm (95% confidence interval, 1.118-1.133) for the machine learning method and 1.172 cm (95% confidence interval, 1.164-1.181) for the mixed-effects method, compared with 2.504 cm (95% confidence interval, 2.487-2.521) for the single-factor methods (P<.0001 for both comparisons). The multifactor machine learning dilation models showed small but statistically significant improvements in accuracy compared with the mixed-effects regression models (P<.0001). The multifactor machine learning method produced a curve of descent with a mean absolute error of 0.512 cm (95% confidence interval, 0.509-0.515) and a root mean squared error of 0.660 cm (95% confidence interval, 0.655-0.666). External validation using independent data produced similar findings.
Cervical dilation models based on multiple clinically relevant parameters showed improved (lower) prediction errors compared to models based on time alone. The mean prediction errors were reduced by more than 50%. A more accurate assessment of departure from expected dilation and station may help clinicians optimize intrapartum management.
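The "more than 50%" figure can be checked against the dilation errors reported in the abstract. The sketch below is an arithmetic check only; the helper name `relative_reduction` and the dictionary layout are illustrative assumptions, while the error values are the ones stated in the abstract.

```python
def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


# Dilation prediction errors in cm, as reported in the abstract.
single_factor = {"mae": 2.122, "rmse": 2.504}
multifactor = {
    "machine learning": {"mae": 0.826, "rmse": 1.126},
    "mixed-effects": {"mae": 0.893, "rmse": 1.172},
}

reductions = {
    (method, metric): relative_reduction(single_factor[metric], value)
    for method, metrics in multifactor.items()
    for metric, value in metrics.items()
}

for (method, metric), reduction in reductions.items():
    print(f"{method} {metric.upper()}: {reduction:.1%} lower than single-factor")
```

Under this reading, the reductions are roughly 61% and 58% for mean absolute error and roughly 55% and 53% for root mean squared error, all above 50%.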
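Summary:
Background: The diagnosis of failure to progress, the most common indication for intrapartum cesarean delivery, is based on assessing cervical dilation and station over time. Labor curves serve as references for expected changes in dilation and fetal descent. The labor curves of Friedman, Zhang et al, and others are based on time alone and were derived from mothers with spontaneous labor onset. However, labor induction is now common, and clinicians also weigh other factors when assessing labor progress. Labor curves that account for induction and other factors influencing labor progress could be more accurate and closer to clinical decision-making.
Objective: To compare the prediction errors of labor curves based on a single factor (time) or on multiple clinically relevant factors, using two modeling methods: mixed-effects regression, a standard statistical method, and Gaussian processes, a machine learning method.
Methods: This was a longitudinal cohort study of changes in dilation and station, based on data from 8022 births in nulliparous women with a live, singleton, vertex-presenting fetus at ≥35 weeks of gestation and a vaginal delivery. New labor curves of dilation and station were generated with 10-fold cross-validation. External validation was performed using a geographically independent group. Model variables included time from the first examination in the 20 hours before delivery; dilation, effacement, and station recorded at the previous examination; cumulative contraction counts; and use of epidural anesthesia and labor induction. To assess model accuracy, the differences between each model's predicted value and its corresponding observed value were calculated. These prediction errors were summarized using mean absolute error and root mean squared error statistics.
Results: (1) Dilation curves based on multiple parameters were more accurate than those derived from time alone. (2) The mean absolute errors of the multifactor methods were better (lower) than that of the single-factor methods [0.826 cm (95% CI, 0.820-0.832) for multifactor machine learning, 0.893 cm (95% CI, 0.885-0.901) for multifactor mixed-effects, and 2.122 cm (95% CI, 2.108-2.136) for single-factor; P<.0001 for both comparisons]. (3) The root mean squared errors of the multifactor methods were also better (lower) than that of the single-factor methods [1.126 cm (95% CI, 1.118-1.133) for machine learning, 1.172 cm (95% CI, 1.164-1.181) for mixed-effects, and 2.504 cm (95% CI, 2.487-2.521) for single-factor; P<.0001 for both comparisons]. (4) The multifactor machine learning dilation models showed small but statistically significant improvements in accuracy compared with the mixed-effects regression models (P<.0001). (5) The multifactor machine learning method produced a curve of descent with a mean absolute error of 0.512 cm (95% CI, 0.509-0.515) and a root mean squared error of 0.660 cm (95% CI, 0.655-0.666). (6) External validation using independent data produced similar findings.
Conclusions: (1) Cervical dilation models based on multiple clinically relevant parameters showed improved (lower) prediction errors compared with models based on time alone. (2) The mean prediction errors were reduced by more than 50%. (3) A more accurate assessment of departure from expected dilation and station may help clinicians optimize intrapartum management.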