BACKGROUND: With the global outbreak and spread of COVID-19, the limited supply of ventilators cannot meet the surging demand for mechanical ventilation in the ICU. Clinical models based on structured data have been proposed to rationalize ventilator allocation, but they often suffer from poor extensibility because of fixed fields and laborious normalization processes. The advent of pre-trained models and downstream fine-tuning methods makes it possible to learn from large amounts of unstructured clinical text for different tasks. However, the hardware requirements of large-scale pre-trained models and of task-agnostic downstream networks have limited their adoption in the clinical domain.
OBJECTIVE: This study proposes an innovative architecture for task-driven predictive models and, based on it, develops a Task-driven Gated Recurrent Attention Pool model (TGRA-P). TGRA-P predicts early mortality risk for mechanically ventilated ICU patients from their clinical notes, to assist clinicians in diagnosis and decision-making.
METHODS: A Task-Specific Embedding Module is proposed to fine-tune the embedding with task labels and save it as static files for downstream calls, which serves the task better and prevents GPU overload. A Gated Recurrent Attention Unit (GRA) is proposed to further strengthen the dependencies between preceding and following information in the text sequence while using fewer parameters. In addition, we propose a Residual Max Pool (RMP) that incorporates all word-level features of the notes into the prediction, so that words are not ignored as they can be in common text classification approaches. Finally, a fully connected decoding network serves as the classifier that predicts mortality risk.
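The abstract does not specify how the RMP is computed. As one possible reading, offered only as an illustration, the sketch below adds each token's input embedding to its recurrent output (the residual connection) and then takes an element-wise maximum over all token positions, so that every word can contribute to the pooled feature vector. The function name `residual_max_pool`, the equal dimensionality of embeddings and hidden states, and the placement of the residual are all assumptions rather than details taken from the study.

```python
def residual_max_pool(embeddings, hidden):
    """Hypothetical RMP sketch: element-wise max over tokens of (hidden + embedding).

    embeddings, hidden: lists of per-token feature vectors with identical shapes.
    This is an assumed formulation, not the authors' published definition.
    """
    if len(embeddings) != len(hidden):
        raise ValueError("embeddings and hidden must cover the same tokens")
    # Residual connection: add the input embedding to the recurrent output per token.
    combined = [
        [h + e for h, e in zip(h_t, e_t)]
        for h_t, e_t in zip(hidden, embeddings)
    ]
    # Max over the token axis, separately for each feature dimension.
    return [max(column) for column in zip(*combined)]


# Toy input: three tokens with two features each (made-up values).
toy_embeddings = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
toy_hidden = [[0.0, 1.0], [3.0, 0.0], [0.5, 0.5]]
pooled = residual_max_pool(toy_embeddings, toy_hidden)
```

In this reading the pooled vector has one entry per feature dimension regardless of note length, which is what lets a fixed-size fully connected classifier consume notes of varying length.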
RESULTS: For 90-day mortality prediction from the clinical notes of mechanically ventilated ICU patients in the MIMIC-III dataset, the proposed model achieves an AUROC of 0.8245±0.0096, an AUPRC of 0.7532±0.0115, an accuracy of 0.7422±0.0028, and an F1-score of 0.6612±0.0059, all of which are better than those of previous studies. Moreover, the superiority of the proposed model over the baseline models is statistically validated through the calculated Cohen's d effect sizes.
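For readers unfamiliar with the effect size mentioned above, Cohen's d for two independent samples is the difference of their means divided by the pooled standard deviation. The following minimal sketch shows that standard formula; the helper name `cohens_d` and the toy score lists are illustrative assumptions and are not results from the study, which does not state exactly how its effect sizes were computed.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(sample_a, sample_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance returns the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    return (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var)


# Made-up scores from repeated runs of two models, for illustration only.
toy_model_scores = [2.0, 4.0, 6.0]
toy_baseline_scores = [1.0, 3.0, 5.0]
effect_size = cohens_d(toy_model_scores, toy_baseline_scores)
```

By Cohen's conventional rule of thumb, values near 0.2, 0.5, and 0.8 are read as small, medium, and large effects, respectively.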
CONCLUSIONS: The experimental results show that TGRA-P, built on the innovative task-driven prognostic architecture, achieves state-of-the-art performance. In future work, we will build upon the provided code and investigate its applicability to different datasets. The model balances performance and efficiency: it not only reduces the cost of early mortality risk prediction but also helps physicians make timely clinical interventions and decisions. By incorporating textual records that are challenging for clinicians to utilize, the model serves as a valuable complement to physicians' judgment and enhances their decision-making process.