Label smoothing

  • Article type: Journal Article
    Mixup is an effective data augmentation method that generates new augmented samples as linear combinations of different original samples. However, if the original samples contain noise or aberrant features, mixup may propagate them to the augmented samples, making the model over-sensitive to these outliers. To solve this problem, this paper proposes a new mixup method called AMPLIFY. It uses the attention mechanism of the Transformer itself to reduce the influence of noise and aberrant values in the original samples on the prediction results, adds no trainable parameters, and has a very low computational cost, thereby avoiding the high resource consumption of common mixup methods such as Sentence Mixup. Experimental results show that, at a smaller computational cost, AMPLIFY outperforms other mixup methods on text classification tasks across seven benchmark datasets, providing new ideas and avenues for further improving the performance of attention-based pre-trained models such as BERT, ALBERT, RoBERTa, and GPT. Our code is available at https://github.com/kiwi-lilo/AMPLIFY.
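
    As a rough illustration of the mixup family this abstract builds on, the sketch below mixes two batches of encoder hidden states and their soft labels with a Beta-sampled coefficient. AMPLIFY's attention-based weighting is not reproduced here; all names and the mixing point are illustrative assumptions.

```python
import torch

def hidden_mixup(h_a, h_b, y_a, y_b, alpha=0.4):
    """Mix two batches of encoder hidden states and their soft labels.

    Generic hidden-state mixup (the family that includes Sentence Mixup);
    NOT the exact AMPLIFY procedure, which reweights by attention.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    h_mix = lam * h_a + (1.0 - lam) * h_b  # convex combination of features
    y_mix = lam * y_a + (1.0 - lam) * y_b  # matching soft labels
    return h_mix, y_mix
```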

  • Article type: Journal Article
    Chest X-ray (CXR) is an extensively utilized radiological modality for supporting the diagnosis of chest diseases. However, existing research approaches have limited ability to integrate multi-scale CXR image features effectively and are further hindered by imbalanced datasets. There is therefore a pressing need for further advancement in computer-aided diagnosis (CAD) of thoracic diseases. To tackle these challenges, we propose a multi-branch residual attention network (MBRANet) for thoracic disease diagnosis. MBRANet comprises three components. First, to address the convolutional layer's inadequate extraction of spatial and positional information, a novel residual structure incorporating a coordinate attention (CA) module is proposed to extract features at multiple scales. Second, we perform multi-scale feature fusion based on the concept of a Feature Pyramid Network (FPN). Third, we propose a novel Multi-Branch Feature Classifier (MFC) that leverages a class-specific residual attention (CSRA) module for classification instead of relying solely on a fully connected layer. In addition, the designed BCEWithLabelSmoothing loss function introduces a smoothing factor that improves generalization and mitigates class imbalance. We evaluated MBRANet on the ChestX-Ray14, CheXpert, MIMIC-CXR, and IU X-Ray datasets, achieving average AUCs of 0.841, 0.895, 0.805, and 0.745, respectively. Our method outperformed state-of-the-art baselines on these benchmark datasets.
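
    One common way to realize a BCEWithLabelSmoothing loss for multi-label CXR classification is sketched below: hard binary targets are moved toward 0.5 by a smoothing factor before the usual BCE-with-logits loss. The paper's exact smoothing scheme may differ; eps is an assumed hyperparameter.

```python
import torch.nn.functional as F

def bce_with_label_smoothing(logits, targets, eps=0.1):
    """Binary cross-entropy on smoothed multi-label targets.

    Targets in {0, 1} are pulled toward 0.5 by eps, one common definition
    of a smoothed BCE loss; not necessarily the paper's exact formulation.
    """
    smoothed = targets * (1.0 - eps) + 0.5 * eps
    return F.binary_cross_entropy_with_logits(logits, smoothed)
```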

  • Article type: Journal Article
    Training with soft labels instead of hard labels can effectively improve the robustness and generalization of deep learning models. Label smoothing typically provides uniformly distributed soft labels during training, but it does not take the semantic differences between labels into account. This article introduces discrimination-aware label smoothing, an adaptive approach that learns appropriate label distributions for iterative optimization objectives. In this approach, positive and negative samples are employed to provide experience from both sides, and regularization and model calibration are improved through an iterative learning method. Experiments on five text classification datasets demonstrate the effectiveness of the proposed method.
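
    For reference, the uniform label smoothing that this work generalizes can be written as follows; the discrimination-aware variant would replace the fixed eps/K off-target mass with a learned, label-dependent distribution.

```python
import torch
import torch.nn.functional as F

def uniform_label_smoothing_ce(logits, target, eps=0.1):
    """Cross-entropy against uniformly smoothed targets (the baseline):
    the true class gets 1 - eps + eps/K, every other class gets eps/K."""
    n_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    smooth = torch.full_like(log_probs, eps / n_classes)
    smooth.scatter_(-1, target.unsqueeze(-1), 1.0 - eps + eps / n_classes)
    return -(smooth * log_probs).sum(dim=-1).mean()
```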

  • Article type: Journal Article
    Deep transfer learning has been widely used to improve the versatility of models. In cross-domain fault diagnosis of rolling bearings, most models require the given data to have similar distributions, which limits their diagnostic effect and generalization. This paper proposes a deep reconstruction transfer convolutional neural network (DRTCNN), which achieves domain adaptation under cross-domain conditions. First, the model uses a deep reconstruction convolutional autoencoder for feature extraction and data reconstruction. Through parameter sharing and unsupervised training, the structural information of target-domain samples is effectively used to extract domain-invariant features. Second, a new subdomain alignment loss function is introduced to align the subdomain distributions of the source and target domains, which improves classification accuracy by reducing the intra-class distance and increasing the inter-class distance. In addition, a label smoothing algorithm that accounts for sample credibility is introduced to train the model's classifier, avoiding the impact of wrong labels on the training process. Three datasets are used to verify the versatility of the model, and the results show that it achieves high accuracy and stability.
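
    A hedged sketch of per-sample smoothing of the kind the abstract describes: samples judged less credible receive a larger smoothing factor. How credibility is actually scored in the paper is not stated, so `credibility` is an assumed input in [0, 1].

```python
import torch
import torch.nn.functional as F

def credibility_label_smoothing_ce(logits, target, credibility, eps_max=0.2):
    """Per-sample label smoothing: less credible (possibly mislabeled)
    samples get stronger smoothing. `credibility` is an assumed per-sample
    score in [0, 1]; the paper's scoring rule is not given in the abstract."""
    n_classes = logits.size(-1)
    eps = eps_max * (1.0 - credibility)            # (batch,) smoothing per sample
    log_probs = F.log_softmax(logits, dim=-1)
    smooth = eps.unsqueeze(-1) / n_classes * torch.ones_like(log_probs)
    on_value = 1.0 - eps + eps / n_classes
    smooth.scatter_(-1, target.unsqueeze(-1), on_value.unsqueeze(-1))
    return -(smooth * log_probs).sum(dim=-1).mean()
```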

  • Article type: Journal Article
    Machine learning-based algorithms demonstrate impressive performance across numerous fields; however, they continue to suffer from certain limitations. Even sophisticated and precise algorithms often make erroneous predictions on datasets whose distributions differ from that of the training set. Out-of-distribution (OOD) detection, which distinguishes data with different distributions from that of the training set, is a critical research area for overcoming these limitations and creating more reliable algorithms. The OOD problem, particularly for image data, has been extensively studied. However, recently developed OOD methods do not fulfill the expectation that OOD performance will increase as in-distribution classification accuracy improves. Our research presents a comprehensive study of OOD detection performance across multiple models and training methodologies to verify this phenomenon. Specifically, we explore various pre-trained models popular in computer vision with both old and new OOD detection methods. The experimental results highlight the performance disparity among existing OOD methods. Based on these observations, we introduce Trimmed Rank with Inverse softMax probability (TRIM), a remarkably simple yet effective method applied to the weights of models trained with newly developed training methods. Owing to its promising results, the proposed method could serve as a potential tool for enhancing OOD detection performance. TRIM's OOD performance is highly consistent with a model's in-distribution accuracy, and it may bridge efforts to improve in-distribution accuracy with the ability to distinguish OOD data.
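
    The comparisons in this line of work build on classic confidence-based OOD scores; the standard maximum-softmax-probability baseline is sketched below for orientation. TRIM's trimmed-rank statistic over model weights itself is more involved and is not reproduced here.

```python
import torch.nn.functional as F

def max_softmax_probability_score(logits):
    """Classic maximum-softmax-probability (MSP) OOD score: low maximum
    class confidence suggests an out-of-distribution input. Shown only as
    the common baseline, not as the TRIM method."""
    probs = F.softmax(logits, dim=-1)
    return probs.max(dim=-1).values  # higher = more likely in-distribution
```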

  • Article type: Journal Article
    Multi-view deep neural networks have shown excellent performance on 3D shape classification tasks. However, global features aggregated from multi-view data often lack content information and spatial relationships, making it difficult to identify the small variances among subcategories within the same category. To solve this problem, this paper proposes a novel multiscale dilated convolutional neural network, termed MSDCNN, for multi-view fine-grained 3D shape classification. First, a sequence of views is rendered from 12 viewpoints around the input 3D shape by the sequential view-capturing module. Then, the first 22 convolution layers of ResNeXt50 are employed to extract the semantic features of each view, and a global mixed feature map is obtained through an element-wise maximum operation over the 12 output feature maps. Furthermore, an attention dilated module (ADM), which combines four concatenated attention dilated blocks (ADBs), is designed to extract larger-receptive-field features from the global mixed feature map to enhance contextual information among the views. Specifically, each ADB consists of an attention mechanism module and a dilated convolution with a different dilation rate. In addition, a prediction module with label smoothing, containing a 3 × 3 convolution and adaptive average pooling, is proposed to classify the features. The performance of our method is validated experimentally on the ModelNet10, ModelNet40, and FG3D datasets. Experimental results demonstrate the effectiveness and superiority of the proposed MSDCNN framework for fine-grained 3D shape classification.
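
    A rough sketch of what one attention dilated block (ADB) could look like under the abstract's description (an attention module plus a dilated convolution); the actual MSDCNN block design is an assumption here, with a squeeze-and-excitation-style gate standing in for the unspecified attention module.

```python
import torch
import torch.nn as nn

class AttentionDilatedBlock(nn.Module):
    """Illustrative ADB: a dilated 3x3 convolution followed by a simple
    channel-attention gate. Shows how the dilation rate enlarges the
    receptive field without extra parameters; not the paper's exact block."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3,
                              padding=dilation, dilation=dilation)
        self.bn = nn.BatchNorm2d(channels)
        self.attn = nn.Sequential(          # squeeze-and-excitation style gate
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        out = torch.relu(self.bn(self.conv(x)))
        return out * self.attn(out)        # channel-wise reweighting
```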

  • Article type: Journal Article
    Recent research on semi-supervised learning (SSL) is mainly based on consistency regularization, which relies on domain-specific data augmentation. Pseudo-labeling is a more general method without such restrictions, but its performance is limited by noisy training. We combine both approaches and focus on generating pseudo-labels using domain-independent weak augmentation. In this article, we propose ReFixMatch-LS and apply it to medical image classification. First, we reduce the impact of noisy artificial labels through label smoothing and consistency regularization. Then, by recording the high-confidence pseudo-labels generated in each epoch during training, we reuse them to train the model in subsequent epochs. ReFixMatch-LS effectively increases the number of pseudo-labels and improves model performance. We validate the effectiveness of ReFixMatch-LS on skin lesion diagnosis with the ISIC 2018 and ISIC 2019 challenge datasets, obtaining AUCs of 91.54%, 93.68%, 94.55%, and 95.47% on the four proportions of labeled data from ISIC 2018.
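
    The FixMatch-style step that ReFixMatch-LS refines can be sketched as follows: confident predictions on weakly augmented inputs become pseudo-labels, which are then label-smoothed to soften residual noise. The confidence threshold and eps are illustrative values, not the paper's.

```python
import torch.nn.functional as F

def smoothed_pseudo_labels(logits_weak, threshold=0.95, eps=0.1):
    """Keep high-confidence predictions on weakly augmented inputs as
    pseudo-labels, then smooth them. Returns soft targets plus a mask
    selecting which samples are confident enough to train on."""
    probs = F.softmax(logits_weak, dim=-1)
    conf, pseudo = probs.max(dim=-1)
    mask = conf >= threshold                        # only confident samples train
    n_classes = logits_weak.size(-1)
    one_hot = F.one_hot(pseudo, n_classes).float()
    soft = one_hot * (1.0 - eps) + eps / n_classes  # label-smoothed targets
    return soft, mask
```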

  • Article type: Journal Article
    Head pose estimation is one of the essential tasks in computer vision: predicting the Euler angles of a head in an image. In recent years, CNN-based methods for head pose estimation have achieved excellent performance. Their training relies on RGB images with annotated facial landmarks or on depth images from RGBD cameras. However, labeling facial landmarks is difficult for large-angle head poses in RGB images, and RGBD cameras are unsuitable for outdoor scenes. We propose a simple and effective annotation method for head poses in RGB images. The novel method uses a 3D virtual human head to simulate the head pose in the RGB image, and the Euler angles can be calculated from the coordinate changes of the 3D virtual head. We then use our annotation method to create the 2DHeadPose dataset, which contains a rich set of attributes, dimensions, and angles. Finally, we propose Gaussian label smoothing to suppress annotation noise and reflect inter-class relationships, and establish a baseline approach using it. Experiments demonstrate that our annotation method, dataset, and Gaussian label smoothing are very effective, and our baseline approach surpasses most current state-of-the-art methods. The annotation tool, dataset, and source code are publicly available at https://github.com/youngnuaa/2DHeadPose.
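
    Gaussian label smoothing for binned angle targets can be sketched as below: the hard one-hot bin is replaced by a normalized Gaussian centered on the true angle, so neighboring bins share probability mass and inter-class similarity is encoded. Bin width, range, and sigma are assumptions, not the paper's settings.

```python
import torch

def gaussian_soft_label(angle_deg, bins=torch.arange(-99, 102, 3), sigma=3.0):
    """Soft target for binned angle classification: a normalized Gaussian
    over angle bins centered at the ground-truth angle. Bin grid and sigma
    are illustrative assumptions."""
    dist = (bins.float() - angle_deg) ** 2
    soft = torch.exp(-dist / (2.0 * sigma ** 2))
    return soft / soft.sum()   # valid probability distribution over bins
```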

  • Article type: Journal Article
    In recent times, many studies on surgical video analysis have been conducted due to its growing importance in medical applications. In particular, recognizing the current surgical phase is very important because phase information can be utilized in various ways both during and after surgery. This paper proposes an efficient phase recognition network, called MomentNet, for cholecystectomy endoscopic videos. Unlike LSTM-based networks, MomentNet is based on a multi-stage temporal convolutional network. Moreover, to improve phase prediction accuracy, the proposed method adopts a new loss function to supplement the general cross-entropy loss. The new loss function significantly improves the performance of the phase recognition network by constraining undesirable phase transitions and preventing over-segmentation. In addition, MomentNet effectively applies positional encoding techniques, commonly used in transformer architectures, to the multi-stage temporal convolutional network; the positional encoding provides important temporal context, resulting in higher phase prediction accuracy. Furthermore, MomentNet applies label smoothing to suppress overfitting and replaces the backbone network for feature extraction to further improve performance. As a result, MomentNet achieves 92.31% accuracy in the phase recognition task on the Cholec80 dataset, which is 4.55% higher than that of the baseline architecture.
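
    The positional encoding referred to here is presumably the standard sinusoidal scheme from transformer architectures; a minimal version is sketched below, assuming it is simply added to per-frame features (how MomentNet actually combines it with the temporal convolution input is not specified in the abstract).

```python
import math
import torch

def sinusoidal_positional_encoding(length, dim):
    """Standard transformer sinusoidal positional encoding for a sequence
    of `length` frame features of size `dim` (assumed even). Typically
    combined by simple addition: x = x + pe."""
    pe = torch.zeros(length, dim)
    position = torch.arange(length, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, dim, 2).float()
                         * (-math.log(10000.0) / dim))
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe
```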

  • Article type: Journal Article
    White blood cells (WBCs) in the human immune system defend against infection and protect the body from external hazardous objects. They comprise neutrophils, eosinophils, basophils, monocytes, and lymphocytes, each of which accounts for a distinct percentage and performs specific functions. Traditionally, the clinical laboratory procedure for quantifying specific types of white blood cells is an integral part of a complete blood count (CBC) test, which aids in monitoring people's health. With advances in deep learning, blood film images can be classified in less time and with high accuracy using various algorithms. This paper exploits a number of state-of-the-art CNN-based deep learning models and their variants. A comparative study of model performance based on accuracy, F1-score, recall, precision, number of parameters, and time was conducted, and DenseNet161 was found to demonstrate superior performance among its counterparts. In addition, advanced optimization techniques such as normalization, mixup augmentation, and label smoothing were employed on DenseNet161 to further refine its performance.
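
    A hedged sketch combining the two regularizers mentioned for DenseNet161, mixup augmentation and label smoothing, in a single training step; the hyperparameters alpha and eps are illustrative values, not the paper's, and `model` is any classifier returning logits.

```python
import torch
import torch.nn.functional as F

def mixup_smoothed_step(model, x, y, n_classes, alpha=0.2, eps=0.1):
    """One training-step loss combining mixup on inputs with label
    smoothing on targets. Hyperparameters are assumptions for illustration."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[idx]          # mixup on images
    one_hot = F.one_hot(y, n_classes).float()
    soft = one_hot * (1.0 - eps) + eps / n_classes  # label smoothing
    y_mix = lam * soft + (1.0 - lam) * soft[idx]    # mixed soft labels
    log_probs = F.log_softmax(model(x_mix), dim=-1)
    return -(y_mix * log_probs).sum(dim=-1).mean()
```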