Keywords: artificial intelligence; convolutional neural network; data augmentation; delineation; digital health; electrocardiogram; multi-centre study; segmentation

Source: DOI:10.3389/fcvm.2024.1341786 (PDF via PubMed)

Abstract:
Background: Extracting beat-by-beat information from electrocardiograms (ECGs) is crucial for various downstream diagnostic tasks that rely on ECG-based measurements. However, these measurements can be expensive and time-consuming to produce, especially for long-term recordings. Traditional ECG detection and delineation methods, relying on classical signal processing algorithms such as those based on wavelet transforms, produce high-quality delineations but struggle to generalise to diverse ECG patterns. Machine learning (ML) techniques based on deep learning algorithms have emerged as promising alternatives, capable of achieving similar performance without handcrafted features or thresholds. However, supervised ML techniques require large annotated datasets for training, and existing datasets for ECG detection/delineation are limited in size and in the range of pathological conditions they represent.
Methods: This article addresses this challenge by introducing two key innovations. First, we develop a synthetic data generation scheme that probabilistically constructs unseen ECG traces from "pools" of fundamental segments extracted from existing databases. A set of rules guides the arrangement of these segments into coherent synthetic traces, while expert domain knowledge ensures the realism of the generated traces, increasing the input variability for training the model. Second, we propose two novel segmentation-based loss functions that encourage accurate prediction of the number of independent ECG structures and promote tighter segmentation boundaries by focusing on a reduced number of samples.
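To make the segment-pool idea more concrete, the following is a minimal, hypothetical Python sketch: fundamental segments (assumed here to be P, PQ, QRS, ST, T, and TP pieces) are drawn from pools of annotated waveform snippets and concatenated into labelled synthetic beats, with a simple rule (occasionally dropping the P wave) standing in for the paper's rule set and expert knowledge. All names, segment types, and probabilities are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of segment-pool synthetic data generation (not the
# authors' code). Segment names, pool structure, and the P-wave-dropping rule
# are illustrative assumptions.
import numpy as np

SEGMENT_ORDER = ["P", "PQ", "QRS", "ST", "T", "TP"]  # canonical beat layout

def build_synthetic_beat(pools, rng):
    """Draw one instance of each fundamental segment and concatenate them."""
    samples, labels = [], []
    for name in SEGMENT_ORDER:
        pool = pools[name]                       # list of 1-D numpy arrays
        seg = pool[rng.integers(len(pool))]      # probabilistic draw
        samples.append(seg)
        labels.append(np.full(len(seg), name, dtype=object))
    return np.concatenate(samples), np.concatenate(labels)

def build_synthetic_trace(pools, n_beats=8, p_drop_p_wave=0.1, seed=0):
    """Chain several beats; a simple rule occasionally drops the P wave."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for _ in range(n_beats):
        x, y = build_synthetic_beat(pools, rng)
        if rng.random() < p_drop_p_wave:         # stand-in for the rule set
            keep = y != "P"
            x, y = x[keep], y[keep]
        xs.append(x)
        ys.append(y)
    return np.concatenate(xs), np.concatenate(ys)
```

Here a pool can be as simple as a dictionary mapping each segment name to a list of 1-D arrays cut out of annotated recordings; the resulting sample-level labels can then be used directly as segmentation targets.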
Results: The proposed approach achieves remarkable performance, with an F1-score of 99.38% and delineation errors of 2.19 ± 17.73 ms and 4.45 ± 18.32 ms for ECG segment onsets and offsets across the P, QRS, and T waves. These results, aggregated from three diverse freely available databases (QT, LU, and Zhejiang), surpass current state-of-the-art detection and delineation approaches.
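As a rough illustration of how such figures can be computed, the sketch below matches predicted and reference wave onsets within a tolerance window and reports a detection F1-score together with the signed delineation error (mean ± SD in ms). The greedy matching rule, the 150 ms tolerance, and the function names are assumptions made for illustration, not the paper's exact evaluation protocol.

```python
# Illustrative evaluation sketch (assumed matching rule and tolerance): greedy
# one-to-one matching of predicted vs. reference wave onsets, then detection F1
# and signed delineation error in milliseconds.
import numpy as np

def match_events(pred, ref, fs, tol_ms=150.0):
    """Pair each reference onset with the closest unused prediction within tol."""
    tol = tol_ms * fs / 1000.0
    used, pairs = set(), []
    for r in ref:
        candidates = [(abs(p - r), i) for i, p in enumerate(pred)
                      if i not in used and abs(p - r) <= tol]
        if candidates:
            _, i = min(candidates)
            used.add(i)
            pairs.append((pred[i], r))
    tp = len(pairs)
    return pairs, tp, len(pred) - tp, len(ref) - tp   # pairs, TP, FP, FN

def detection_and_delineation(pred, ref, fs):
    """Return (F1, mean error in ms, SD of error in ms) for one wave type."""
    pairs, tp, fp, fn = match_events(pred, ref, fs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    err_ms = np.array([(p - r) * 1000.0 / fs for p, r in pairs])
    mean_err = float(err_ms.mean()) if err_ms.size else float("nan")
    std_err = float(err_ms.std()) if err_ms.size else float("nan")
    return f1, mean_err, std_err
```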
Conclusions: Notably, the model demonstrated exceptional performance despite variations in lead configurations, sampling frequencies, and represented pathophysiological mechanisms, underscoring its robust generalisation capabilities. Real-world examples, featuring clinical data with various pathologies, illustrate the potential of our approach to streamline ECG analysis across different medical settings, fostered by the release of the code as open source.