Feature Extraction

  • Article type: Journal Article
    Recent image classification efforts have achieved certain success by incorporating prior information such as labels and logical rules to learn discriminative features. However, these methods overlook the variability of features, resulting in feature inconsistency and fluctuations in model parameter updates, which further contribute to decreased image classification accuracy and model instability. To address this issue, this paper proposes a novel method combining structural prior-driven feature extraction with gradient-momentum (SPGM), from the perspectives of consistent feature learning and precise parameter updates, to enhance the accuracy and stability of image classification. Specifically, SPGM leverages a structural prior-driven feature extraction (SPFE) approach to calculate gradients of multi-level features and original images to construct structural information, which is then transformed into prior knowledge to drive the network to learn features consistent with the original images. Additionally, an optimization strategy integrating gradients and momentum (GMO) is introduced, dynamically adjusting the direction and step size of parameter updates based on the angle and norm of the sum of gradients and momentum, enabling precise model parameter updates. Extensive experiments on CIFAR10 and CIFAR100 datasets demonstrate that the SPGM method significantly reduces the top-1 error rate in image classification, enhances the classification performance, and outperforms state-of-the-art methods.
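
The abstract above does not give the GMO update formula, so the following is only a minimal sketch of the general idea of using the angle and norm of the gradient-plus-momentum sum to set the step. The function name `gmo_like_step`, the cosine-based scaling, and the default `lr` and `beta` values are all assumptions made for illustration, not the rule from the paper.

```python
import math

def gmo_like_step(params, grad, momentum, lr=0.1, beta=0.9, eps=1e-12):
    """One illustrative update; an assumed heuristic, NOT the paper's rule."""
    # Standard exponential moving average of gradients (momentum buffer).
    momentum = [beta * m + (1.0 - beta) * g for m, g in zip(momentum, grad)]
    # Combined direction: sum of the current gradient and the momentum.
    combined = [g + m for g, m in zip(grad, momentum)]
    norm_g = math.sqrt(sum(g * g for g in grad))
    norm_c = math.sqrt(sum(c * c for c in combined))
    # Cosine of the angle between the gradient and the combined direction.
    cos = sum(g * c for g, c in zip(grad, combined)) / (norm_g * norm_c + eps)
    # Assumption: shrink the step to zero when the two directions disagree.
    scale = lr * max(cos, 0.0)
    # Step along the unit combined direction, so its norm does not set the size.
    new_params = [p - scale * c / (norm_c + eps) for p, c in zip(params, combined)]
    return new_params, momentum
```

When the gradient and the accumulated momentum point the same way, this sketch takes a full step of length `lr`; when they oppose each other, the step collapses to zero, which is one simple way to damp fluctuating updates.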

  • Article type: Journal Article
    Noisy labels hurt deep learning-based supervised image classification performance, as models may overfit the noise and learn corrupted feature extractors. For natural image classification training with noisy labeled data, model initialization with contrastive self-supervised pretrained weights has been shown to reduce feature corruption and improve classification performance. However, no works have explored: i) how other self-supervised approaches, such as pretext task-based pretraining, impact learning with noisy labels, and ii) any self-supervised pretraining methods alone for medical images in noisy label settings. Medical images often feature smaller datasets and subtle inter-class variations, requiring human expertise to ensure correct classification. Thus, it is not clear if the methods improving learning with noisy labels in natural image datasets such as CIFAR would also help with medical images. In this work, we explore contrastive and pretext task-based self-supervised pretraining to initialize the weights of a deep learning classification model for two medical datasets with self-induced noisy labels: NCT-CRC-HE-100K tissue histological images and COVID-QU-Ex chest X-ray images. Our results show that models initialized with pretrained weights obtained from self-supervised learning can effectively learn better features and improve robustness against noisy labels.
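
The abstract mentions "self-induced" noisy labels without stating the noise protocol. A minimal sketch of one common choice, symmetric label noise, is given below; the function name `inject_symmetric_noise`, its parameters, and the choice of symmetric noise itself are assumptions for illustration only.

```python
import random

def inject_symmetric_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each label to a different, uniformly chosen class with probability noise_rate."""
    rng = random.Random(seed)  # seeded so the corrupted label set is reproducible
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Draw from the other num_classes - 1 classes so the label really changes.
            other = rng.randrange(num_classes - 1)
            noisy.append(other if other < y else other + 1)
        else:
            noisy.append(y)
    return noisy
```

Keeping the clean labels alongside the noisy ones makes it possible to measure afterwards how much of the injected noise a model has memorized.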

  • Article type: Journal Article
    Semantic segmentation of target objects in power transmission line corridor point cloud scenes is a crucial step in powerline tree barrier detection. The massive quantity, disordered distribution, and non-uniformity of point clouds in power transmission line corridor scenes pose significant challenges for feature extraction. Previous studies have often overlooked the core utilization of spatial information, limiting the network's ability to understand complex geometric shapes. To overcome this limitation, this paper focuses on enhancing the deep expression of spatial geometric information in segmentation networks and proposes a method called BDF-Net to improve RandLA-Net. For each input 3D point cloud, BDF-Net first encodes the relative coordinates and relative distance information into spatial geometric feature representations through the Spatial Information Encoding block to capture the local spatial structure of the point cloud data. Subsequently, the Bilinear Pooling block effectively combines the feature information of the point cloud with the spatial geometric representation by leveraging its bilinear interaction capability, thus learning more discriminative local feature descriptors. The Global Feature Extraction block captures the global structure information in the point cloud data by using the ratio between the point position and the relative position, so as to enhance the semantic understanding ability of the network. To verify the performance of BDF-Net, this paper constructs a dataset, PPCD, for the point cloud scenario of transmission line corridors and conducts detailed experiments on it. The experimental results show that BDF-Net achieves significant performance improvements in various evaluation metrics, specifically achieving an OA of 97.16%, an mIoU of 77.48%, and an mAcc of 87.6%, which are 3.03%, 16.23%, and 18.44% higher than RandLA-Net, respectively. Moreover, comparisons with other state-of-the-art methods also verify the superiority of BDF-Net in point cloud semantic segmentation tasks.
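
To make the "relative coordinates and relative distance" inputs concrete, here is a minimal sketch that computes them for the k nearest neighbours of one point. It only illustrates the raw geometric quantities; the learned encoding layers of BDF-Net are not described in the abstract, and the helper names `knn_indices` and `relative_encoding` are hypothetical.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i] (brute-force search)."""
    order = sorted(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )
    return order[:k]

def relative_encoding(points, i, k):
    """For each neighbour, return (dx, dy, dz, distance) relative to points[i]."""
    center = points[i]
    feats = []
    for j in knn_indices(points, i, k):
        # Relative coordinates: neighbour position minus centre position.
        rel = tuple(n - c for n, c in zip(points[j], center))
        # Relative distance: Euclidean norm of the relative coordinates.
        feats.append(rel + (math.sqrt(sum(d * d for d in rel)),))
    return feats
```

A brute-force neighbour search is quadratic in the number of points; real point cloud pipelines typically use a spatial index such as a k-d tree instead.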

  • Article type: Journal Article
    For industry image data, this paper proposes an image classification method based on stochastic configuration networks and multi-scale feature extraction. The multi-scale features are extracted from images of different scales using deep 2DSCN, and the hidden features of multiple layers are also connected together to obtain more informational features. The integrated features are fed into SCNs to learn a classifier which improves the recognition rate for different categories. In the experiments, a handwritten digit database and an industry hot-rolled steel strip database are used, and the comparison results demonstrate the proposed method can effectively improve the classification accuracy.

  • Article type: Journal Article
    In the grain industry, identifying seed purity is a crucial task because it is an important factor in evaluating seed quality. For rice seeds, this attribute enables the minimization of unexpected influences of other varieties on rice yield, nutrient composition, and price. However, in practice, they are often mixed with seeds from other varieties. This study proposes a novel method for automatically identifying the purity of a specific rice variety using hybrid machine learning algorithms. The core concept involves leveraging deep learning architectures to extract pertinent features from raw data, followed by the application of machine learning algorithms for classification. Several experiments are conducted to evaluate the performance of the proposed model through practical implementation. The results demonstrate that the novel method substantially outperformed the existing methods, demonstrating the potential for effective rice seed purity identification systems.

  • Article type: Journal Article
    In this paper, a new approach is introduced for classifying music genres. The proposed approach transforms an audio signal into a unified representation known as a sound spectrum, from which texture features are extracted using an enhanced Ridgelet Neural Network (RNN). Additionally, the RNN is optimized using an improved version of the partial reinforcement effect optimizer (IPREO) that effectively avoids local optima and enhances the RNN's generalization capability. The GTZAN dataset is used in experiments to assess the effectiveness of the proposed RNN/IPREO model for music genre classification. The results show an accuracy of 92% when combining spectral centroid, Mel-spectrogram, and Mel-frequency cepstral coefficients (MFCCs) as features. This performance significantly exceeds that of K-Means (58%) and Support Vector Machines (up to 68%). Furthermore, the RNN/IPREO model outperformed various deep learning architectures such as Neural Networks (65%), RNNs (84%), CNNs (88%), DNNs (86%), VGG-16 (91%), and ResNet-50 (90%). It is worth noting that the RNN/IPREO model achieved results comparable to well-known deep models like VGG-16, ResNet-50, and RNN-LSTM, sometimes even surpassing their scores. This highlights the strength of its hybrid CNN-Bi-directional RNN design in conjunction with the IPREO parameter optimization algorithm for extracting intricate and sequential auditory data.
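
Of the three features named above, the spectral centroid is the simplest to write out: it is the magnitude-weighted mean frequency of a frame. The sketch below uses a naive DFT from the standard library so that it stays self-contained; the function names and the per-frame interface are assumptions, and the paper's actual feature pipeline (windowing, hop size, library) is not stated in the abstract.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0.0:
        return 0.0  # silent frame: the centroid is undefined, return 0 by convention
    n = len(frame)
    # Bin k corresponds to the frequency k * sample_rate / n.
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total
```

For a pure tone whose frequency falls exactly on a DFT bin, the centroid sits at that frequency; brighter, noisier sounds push it upward. In practice an FFT routine would replace the quadratic-time DFT used here.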

  • Article type: Journal Article
    A priority for machine learning in healthcare and other high-stakes applications is to enable end-users to easily interpret individual predictions. This opinion piece outlines recent developments in interpretable classifiers and methods to open black-box models.

  • Article type: Journal Article
    Deep learning has emerged as a robust tool for automating feature extraction from three-dimensional images, offering an efficient alternative to labour-intensive and potentially biased manual image segmentation methods. However, there has been limited exploration into the optimal training set sizes, including assessing whether artificial expansion by data augmentation can achieve consistent results in less time and how consistent these benefits are across different types of traits. In this study, we manually segmented 50 planktonic foraminifera specimens from the genus Menardella to determine the minimum number of training images required to produce accurate volumetric and shape data from internal and external structures. The results reveal, unsurprisingly, that deep learning models improve with a larger number of training images, with eight specimens being required to achieve 95% accuracy. Furthermore, data augmentation can enhance network accuracy by up to 8.0%. Notably, predicting both volumetric and shape measurements for the internal structure poses a greater challenge compared with the external structure, owing to low contrast differences between different materials and increased geometric complexity. These results provide novel insight into optimal training set sizes for precise image segmentation of diverse traits and highlight the potential of data augmentation for enhancing multivariate feature extraction from three-dimensional images.
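
The abstract does not say which augmentations were used, so the following is only a minimal sketch of one generic option for three-dimensional segmentation data: mirroring a volume and its label mask together. The nested-list representation and the names `flip_volume` and `augment_pair` are assumptions for illustration.

```python
def flip_volume(volume, axis):
    """Return a copy of a 3D nested-list volume mirrored along axis 0, 1, or 2."""
    if axis == 0:
        return [[row[:] for row in plane] for plane in reversed(volume)]
    if axis == 1:
        return [[row[:] for row in reversed(plane)] for plane in volume]
    if axis == 2:
        return [[row[::-1] for row in plane] for plane in volume]
    raise ValueError("axis must be 0, 1, or 2")

def augment_pair(image, mask):
    """Yield the original (image, mask) pair plus one mirrored pair per axis."""
    yield image, mask
    for axis in range(3):
        # The same flip is applied to both so labels stay aligned with voxels.
        yield flip_volume(image, axis), flip_volume(mask, axis)
```

Each manually segmented specimen then contributes four training pairs instead of one, without any additional labelling effort.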

  • Article type: Journal Article
    Plant diseases can spread rapidly, leading to significant crop losses if not detected early. By accurately identifying diseased plants, farmers can target treatment only to the affected areas, reducing the amount of pesticides or fungicides needed and minimizing environmental impact. Tomatoes are among the most significant and extensively consumed crops worldwide. The main factor affecting crop yield quantity and quality is leaf disease. Various diseases can affect tomato production, impacting both yield and quality. Automated classification of leaf images allows for the early identification of diseased plants, enabling prompt intervention and control measures. Many creative approaches to diagnosing and categorizing specific illnesses have been widely employed. The manual method is costly and labor-intensive. Without the assistance of an agricultural specialist, disease detection can be facilitated by image processing combined with machine learning algorithms. In this study, diseases in tomato leaves are detected using a new feature extraction method based on conformable polynomial image features, for accurate and faster detection of plant diseases through a machine learning model. The methodology of this study is based on:
      • Preprocessing, feature extraction, dimension reduction, and classification modules.
      • The conformable polynomials method is used to extract the texture features, which are passed to the classifier.
      • The proposed texture feature is constructed from two parts: the enhanced based term and the texture detail part for textual analysis.
      • Tomato leaf samples from the PlantVillage image dataset were used to gather the data for this model.
    Disease detection is 98.80% accurate for tomato leaf images using an SVM classifier. In addition to lowering financial loss, the suggested feature extraction method can help manage plant diseases effectively, improving crop yield and food security.

  • Article type: Journal Article
    The identification of anticancer peptides (ACPs) is crucial, especially in the development of peptide-based cancer therapy. The classical models such as Split Amino Acid Composition (SAAC) and Pseudo Amino Acid Composition (PseAAC) lack the incorporation of feature representation. These advancements improve the predictive accuracy and efficiency of ACP identification. Thus, this research aims to propose and develop an advanced framework based on feature extraction. To achieve this objective, we propose an Extended Dipeptide Composition (EDPC) framework. The proposed EDPC framework extends the dipeptide composition by considering the local sequence environment information and reforming the CD-HIT framework to remove noise and redundancy. To measure the accuracy, we performed several experiments using four well-known machine learning (ML) algorithms, namely Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and K Nearest Neighbor (KNN). For comparisons, we used accuracy, specificity, sensitivity, precision, recall, and F1-score as evaluation criteria. The reliability of the proposed framework is further evaluated using statistical significance tests. As a result, the proposed EDPC framework outperformed SAAC and PseAAC, with the SVM model delivering the highest accuracy of 96.6% and significant enhancements in specificity, sensitivity, precision, and F1-score over multiple datasets. Owing to the incorporation of enhanced feature representation and of local and global sequence profiles, the proposed EDPC achieves higher classification performance. The proposed framework can handle noise as well as duplicate features. These are accompanied by a wide range of feature representations. Finally, our proposed framework can be used for clinical applications where ACP identification is essential. Future work will include extending to a larger variety of datasets, incorporating tertiary structural information, and using deep learning techniques to improve the proposed EDPC.
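
As background for the framework above, the classical dipeptide composition that EDPC is said to extend can be sketched in a few lines: it is the normalized frequency of each of the 400 ordered pairs of adjacent standard amino acids. The code below shows only this baseline descriptor; the EDPC extensions (local sequence environment, CD-HIT-based redundancy removal) are not specified in the abstract and are not implemented here. The function and constant names are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs

def dipeptide_composition(sequence):
    """400-dim vector: frequency of each adjacent residue pair in the sequence."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for a, b in zip(sequence, sequence[1:]):
        pair = a + b
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)  # sequence too short or entirely non-standard
    return [counts[d] / total for d in DIPEPTIDES]
```

The resulting fixed-length vector can be fed directly to classifiers such as SVM, decision trees, random forests, or KNN, regardless of peptide length.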