Keywords: contrastive learning; frequency-domain augmentation; representation learning; self-supervised learning; time-domain augmentation

Source: DOI:10.3389/frai.2024.1414352

Abstract:
Time series are a typical data type in numerous domains; however, labeling large amounts of time series data can be costly and time-consuming. Learning effective representations from unlabeled time series data is a challenging task, and contrastive learning stands out as a promising method for acquiring such representations. We therefore propose a self-supervised time-series representation learning framework based on Time-Frequency Fusion Contrasting (TF-FC) to learn time-series representations from unlabeled data. Specifically, TF-FC combines time-domain augmentation with frequency-domain augmentation to generate diverse samples. For time-domain augmentation, the raw time series passes through a time-domain augmentation bank (e.g., jitter, scaling, permutation, and masking) to produce time-domain augmented data. For frequency-domain augmentation, the raw time series is first converted into frequency-domain data via the Fast Fourier Transform (FFT); the frequency data then passes through a frequency-domain augmentation bank (e.g., low-pass filtering, frequency removal, frequency addition, and phase shifting) to produce frequency-domain augmented data. The time-domain and frequency-domain augmented data are fused with kernel PCA, which is useful for extracting nonlinear features in high-dimensional spaces. By capturing both the time and frequency domains of the time series, the proposed approach extracts more informative features from the data, enhancing the model's capacity to distinguish between different time series. To verify the effectiveness of TF-FC, we conducted experiments on four time-series datasets (i.e., SleepEEG, HAR, Gesture, and Epilepsy). Experimental results show that TF-FC significantly improves recognition accuracy compared with other SOTA methods.
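The sketch below illustrates the augmentation-and-fusion pipeline the abstract describes: a time-domain augmentation bank, a frequency-domain augmentation bank applied after an FFT, and kernel PCA as the fusion step. It is a minimal toy example assuming NumPy and scikit-learn; the function names, parameter values, and the way the two views are concatenated before kernel PCA are illustrative assumptions, not the authors' TF-FC implementation (the contrastive encoder and loss are omitted).

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)

# ---- Time-domain augmentation bank (jitter, scaling, permutation, masking) ----
def jitter(x, sigma=0.05):
    """Add Gaussian noise to the raw series."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def scaling(x, sigma=0.1):
    """Multiply the series by a random scalar close to 1."""
    return x * rng.normal(1.0, sigma)

def permutation(x, n_segments=4):
    """Split the series into segments and shuffle their order."""
    segments = np.array_split(x, n_segments)
    rng.shuffle(segments)
    return np.concatenate(segments)

def masking(x, mask_ratio=0.1):
    """Zero out a random contiguous window."""
    x = x.copy()
    length = max(1, int(len(x) * mask_ratio))
    start = rng.integers(0, len(x) - length)
    x[start:start + length] = 0.0
    return x

# ---- Frequency-domain augmentation bank (applied to the FFT of the series) ----
def low_pass(spec, keep_ratio=0.5):
    """Keep only the lowest-frequency bins."""
    spec = spec.copy()
    spec[int(len(spec) * keep_ratio):] = 0.0
    return spec

def remove_frequency(spec, drop_prob=0.1):
    """Randomly zero out frequency bins."""
    return spec * (rng.random(len(spec)) > drop_prob)

def add_frequency(spec, add_prob=0.05, amplitude=0.1):
    """Inject small random components at randomly chosen bins."""
    mask = rng.random(len(spec)) < add_prob
    return spec + mask * amplitude * np.abs(spec).max()

def phase_shift(spec, max_shift=np.pi / 4):
    """Rotate the phase of every bin by a random angle."""
    return spec * np.exp(1j * rng.uniform(-max_shift, max_shift, size=len(spec)))

# ---- Toy batch: augment each series in both domains, then fuse with kernel PCA ----
batch = np.stack([np.sin(np.linspace(0, (k + 4) * np.pi, 256)) for k in range(8)])

fused_inputs = []
for x in batch:
    x_time = masking(permutation(scaling(jitter(x))))        # time-domain view
    spec = np.fft.rfft(x)                                     # FFT analysis
    spec = phase_shift(add_frequency(remove_frequency(low_pass(spec))))
    x_freq = np.fft.irfft(spec, n=len(x))                     # frequency-domain view
    fused_inputs.append(np.concatenate([x_time, x_freq]))

# Kernel PCA (RBF kernel) as the nonlinear fusion / feature-extraction step.
fused = KernelPCA(n_components=4, kernel="rbf").fit_transform(np.stack(fused_inputs))
print(fused.shape)  # (8, 4)
```

In this sketch the fused features would feed the contrastive objective; the choice of an RBF kernel and four components is arbitrary and only meant to show kernel PCA extracting nonlinear features from the concatenated time- and frequency-domain views.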