Keywords: MNIST classification; covariate shift; deep learning; domain adaptation; supervised learning

Source: DOI:10.3389/frai.2022.927676   PDF (PubMed)

Abstract:
We propose a direct domain adaptation (DDA) approach that enriches the training of supervised neural networks on synthetic data with features from real-world data. The process applies a series of linear operations to the input features of the neural network (NN) model, whether they come from the source or the target distribution: (1) a cross-correlation of the input data (i.e., images) with a randomly picked sample pixel (or pixels) of all images from the input, or with the mean of all randomly picked sample pixels of all input images; (2) a convolution of the resulting data with the mean of the autocorrelated input images from the other domain. In the training stage, the input images are from the source distribution, and the mean of the autocorrelated images is evaluated from the target distribution. In the inference/application stage, the input images are from the target distribution, and the mean of the autocorrelated images is evaluated from the source distribution. The proposed method manipulates only the data from the source and target domains and does not explicitly interfere with the training workflow or the network architecture. In one application, a convolutional neural network trained on the MNIST dataset and tested on the MNIST-M dataset achieves 70% accuracy on the test data. Principal component analysis (PCA) and t-SNE show that, after the proposed direct transformations, the input features from the source and target domains share similar properties along the principal components, in contrast to the original MNIST and MNIST-M input features.
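The two linear operations described in the abstract can be sketched in NumPy as follows. This is only an illustrative reading of the abstract, not the authors' implementation. It assumes single-channel images stored as an array of shape (N, H, W), circular (FFT-based) correlation and convolution, and a reference formed from the mean of one randomly picked pixel per image. The function names `mean_autocorrelation`, `random_pixel_reference`, and `dda_transform` are hypothetical.

```python
import numpy as np


def mean_autocorrelation(images):
    """Mean circular autocorrelation of a batch of images, shape (N, H, W).

    Uses the Wiener-Khinchin relation: the autocorrelation of an image is
    the inverse FFT of its power spectrum.
    """
    spectra = np.fft.fft2(images, axes=(-2, -1))
    power = np.abs(spectra) ** 2
    return np.fft.ifft2(power.mean(axis=0)).real


def random_pixel_reference(images, rng):
    """Mean of one randomly picked pixel per image (assumed reference choice)."""
    n, h, w = images.shape
    rows = rng.integers(0, h, size=n)
    cols = rng.integers(0, w, size=n)
    return images[np.arange(n), rows, cols].mean()


def dda_transform(images, other_domain_images, rng):
    """Apply the two linear steps to `images` (hypothetical helper).

    Step 1: cross-correlate each image with the reference pixel. For a
            single real pixel this reduces to scaling by its value.
    Step 2: circularly convolve the result with the mean autocorrelation
            of the images from the other domain.
    """
    ref = random_pixel_reference(images, rng)
    correlated = images * ref
    kernel = mean_autocorrelation(other_domain_images)
    product = np.fft.fft2(correlated, axes=(-2, -1)) * np.fft.fft2(kernel)
    return np.fft.ifft2(product, axes=(-2, -1)).real


# Random arrays stand in for the source (e.g. MNIST) and target
# (e.g. MNIST-M converted to grayscale) images.
rng = np.random.default_rng(0)
source = rng.random((16, 28, 28))
target = rng.random((16, 28, 28))

train_inputs = dda_transform(source, target, rng)  # training stage
test_inputs = dda_transform(target, source, rng)   # inference stage
```

In this sketch the training inputs carry the target domain's mean autocorrelation and the inference inputs carry the source domain's, mirroring the swap of roles described above. Because only the inputs change, the network and training loop would remain untouched.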