Keywords: conditional label; detracking autoencoding; generative adversarial network; imputation; tabular data

Source: DOI:10.3390/e26050402

Abstract:
Due to various reasons, such as limitations in data collection and interruptions in network transmission, gathered data often contain missing values. Existing state-of-the-art generative adversarial imputation methods face three main issues: limited applicability, neglect of latent categorical information that could reflect relationships among samples, and an inability to balance local and global information. We propose a novel generative adversarial model named DTAE-CGAN that incorporates detracking autoencoding and conditional labels to address these issues. This enhances the network's ability to learn inter-sample correlations and makes full use of all data information in incomplete datasets, rather than learning random noise. We conducted experiments on six real datasets of varying sizes, comparing our method with four classic imputation baselines. The results demonstrate that our proposed model consistently exhibited superior imputation accuracy.
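As a rough illustration of the evaluation setting the abstract describes (hide some entries of a complete table, impute them, then score the imputed values), the following is a minimal sketch and not the DTAE-CGAN model itself. Everything specific in it is an assumption: the abstract names neither the accuracy metric nor the four baselines, so column-mean imputation and RMSE on the hidden cells are used here purely as stand-ins, and the helper names `make_mask`, `mean_impute`, and `rmse_on_missing` are hypothetical.

```python
import math
import random

def make_mask(rows, missing_rate, rng):
    """Return a 0/1 mask per cell: 1 = observed, 0 = missing (chosen at random)."""
    return [[0 if rng.random() < missing_rate else 1 for _ in row] for row in rows]

def mean_impute(rows, mask):
    """Assumed baseline: fill each missing cell with its column's observed mean."""
    n_rows, n_cols = len(rows), len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [rows[i][j] for i in range(n_rows) if mask[i][j] == 1]
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [rows[i][j] if mask[i][j] == 1 else col_means[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]

def rmse_on_missing(truth, imputed, mask):
    """Assumed metric: RMSE over only the cells that were hidden (mask == 0)."""
    errs = [
        (truth[i][j] - imputed[i][j]) ** 2
        for i in range(len(truth))
        for j in range(len(truth[0]))
        if mask[i][j] == 0
    ]
    return math.sqrt(sum(errs) / len(errs)) if errs else 0.0

# Synthetic stand-in for a tabular dataset: 200 rows, 4 numeric columns.
rng = random.Random(0)
truth = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
mask = make_mask(truth, 0.2, rng)      # hide roughly 20% of the cells
imputed = mean_impute(truth, mask)     # a generative imputer would replace this step
score = rmse_on_missing(truth, imputed, mask)
```

Under this setup, any imputer, including a generative adversarial one, would take the place of `mean_impute`: it receives the table and the mask, leaves observed cells unchanged, and is scored only on the cells that were hidden.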