Keywords: Automatic rejection; Isolation Forest; Long-Term EEG; Machine learning; Outlier detection

MeSH terms: Humans; Signal Processing, Computer-Assisted; Artifacts; Retrospective Studies; Algorithms; Electroencephalography / methods

Source: DOI:10.1007/s11517-023-02961-5

Abstract:
Long-term electroencephalography (Long-Term EEG) can monitor patients over extended periods, which makes it a valuable tool in medical institutions. However, because of the large volume of patient data, selecting clean segments from raw Long-Term EEG for further analysis is extremely time-consuming and labor-intensive. In addition, the various actions of patients during recording make part of the EEG data difficult to denoise algorithmically, so these data end up being rejected. Tools that quickly reject heavily corrupted epochs in Long-Term EEG records are therefore highly beneficial. This paper proposes a reliable and fast automatic artifact rejection method for Long-Term EEG based on Isolation Forest (IF). Specifically, the IF algorithm is applied repeatedly to detect outliers in the EEG data, and the boundary of the inliers is promptly adjusted using a statistical indicator so that the algorithm proceeds iteratively. The iteration terminates when the distance metric between clean epochs and artifact-corrupted epochs no longer changes. Six statistical indicators (min, max, median, mean, kurtosis, and skewness) are evaluated as the centroid used to adjust the boundary during iteration, and the proposed method is compared with several state-of-the-art methods on a retrospectively collected dataset. The experimental results indicate that using the min value of the data as the centroid yields the best performance, and that the proposed method is effective and reliable for automatic artifact rejection in Long-Term EEG, significantly improving overall data quality. The proposed method also outperforms the compared methods on most data segments with poor data quality, which demonstrates its capacity to improve heavily corrupted data. Finally, owing to the linear time complexity of IF, the proposed method is much faster than the other methods, which is an advantage when dealing with extensive datasets.
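
The abstract does not spell out the boundary rule or the distance metric, so the following is only a minimal Python sketch of one possible reading of the iterative scheme, not the authors' implementation. Everything named here is an assumption made for illustration: each epoch is reduced to a hypothetical one-dimensional feature (a peak-to-peak amplitude), the inlier boundary is taken as `indicator(scores) + margin` on the anomaly scores with `margin` a made-up constant, and the stopping distance is the Euclidean distance between the mean feature vectors of kept and rejected epochs. A small pure-Python isolation forest stands in for a library implementation such as scikit-learn's `IsolationForest` so that the sketch has no dependencies.

```python
import math
import random
import statistics


def _avg_path(n):
    """Expected path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def _build(rows, depth, limit, rng):
    """Grow one isolation tree; a leaf is stored as its row count."""
    if depth >= limit or len(rows) <= 1:
        return len(rows)
    dim = rng.randrange(len(rows[0]))
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:  # simplification: stop instead of trying another feature
        return len(rows)
    cut = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < cut]
    right = [r for r in rows if r[dim] >= cut]
    return (dim, cut, _build(left, depth + 1, limit, rng),
            _build(right, depth + 1, limit, rng))


def _path(row, node):
    """Depth at which `row` lands, plus the usual correction for leaf size."""
    depth = 0
    while isinstance(node, tuple):
        dim, cut, left, right = node
        node = left if row[dim] < cut else right
        depth += 1
    return depth + _avg_path(node)


def forest_scores(rows, rng, n_trees=100, sample=256):
    """Anomaly score per row in (0, 1]; higher means easier to isolate."""
    psi = min(sample, len(rows))
    limit = math.ceil(math.log2(max(psi, 2)))
    trees = [_build(rng.sample(rows, psi), 0, limit, rng) for _ in range(n_trees)]
    norm = _avg_path(psi) or 1.0
    return [2.0 ** (-statistics.fmean(_path(r, t) for t in trees) / norm)
            for r in rows]


def _centroid(features, idx):
    """Mean feature vector of the epochs listed in `idx`."""
    return [statistics.fmean(features[i][d] for i in idx)
            for d in range(len(features[0]))]


def iterative_rejection(features, indicator=min, margin=0.25,
                        max_rounds=10, tol=1e-9, seed=0):
    """Re-fit the forest on the kept epochs until the groups stop moving."""
    rng = random.Random(seed)
    kept, rejected, prev_dist = list(range(len(features))), [], None
    for _ in range(max_rounds):
        if len(kept) < 2:
            break
        scores = forest_scores([features[i] for i in kept], rng)
        boundary = indicator(scores) + margin  # hypothetical boundary rule
        rejected += [i for i, s in zip(kept, scores) if s > boundary]
        kept = [i for i, s in zip(kept, scores) if s <= boundary]
        if not rejected or not kept:
            break
        # Assumed distance metric: gap between the two group centroids.
        dist = math.dist(_centroid(features, kept), _centroid(features, rejected))
        if prev_dist is not None and abs(dist - prev_dist) <= tol:
            break  # distance unchanged, so the last round rejected nothing new
        prev_dist = dist
    return kept, rejected


# Toy data (invented): 200 quiet epochs and 10 large-amplitude epochs, in µV.
clean = [(40.0 + i % 20,) for i in range(200)]
artifacts = [(300.0 + 100.0 * j,) for j in range(10)]
kept, rejected = iterative_rejection(clean + artifacts)
print(len(kept), sorted(rejected))
```

With `indicator=min` and a non-negative `margin`, the lowest-scoring epoch always stays inside the boundary, so the kept set can never become empty. Other indicators such as `statistics.median` or `statistics.fmean` can be passed in the same way to compare their behavior on real features.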