Feature Extraction

  • Article type: Journal Article
    Noisy labels hurt deep learning-based supervised image classification because models may overfit the noise and learn corrupted feature extractors. For natural image classification with noisily labeled training data, initializing the model with contrastive self-supervised pretrained weights has been shown to reduce feature corruption and improve classification performance. However, no work has explored: i) how other self-supervised approaches, such as pretext task-based pretraining, affect learning with noisy labels, and ii) any self-supervised pretraining method on its own for medical images in noisy-label settings. Medical images often come in smaller datasets with subtle inter-class variations, and they require human expertise to ensure correct classification. Thus, it is not clear whether methods that improve learning with noisy labels on natural image datasets such as CIFAR would also help with medical images. In this work, we explore contrastive and pretext task-based self-supervised pretraining to initialize the weights of a deep learning classification model for two medical datasets with self-induced noisy labels: NCT-CRC-HE-100K tissue histological images and COVID-QU-Ex chest X-ray images. Our results show that models initialized with pretrained weights obtained from self-supervised learning can effectively learn better features and improve robustness against noisy labels.
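    The abstract does not say how the noisy labels were induced. As a minimal sketch, assuming symmetric (uniform) label noise, a fixed fraction of labels can be flipped to a different, randomly chosen class. The function name `inject_symmetric_noise` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def inject_symmetric_noise(labels, num_classes, noise_rate, seed=0):
    """Return a copy of `labels` with a fraction `noise_rate` flipped.

    Assumption (not from the paper): symmetric noise, where each selected
    label is replaced by a different class chosen uniformly at random.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(noise_rate * len(labels))
    # Pick distinct positions to corrupt, then flip each to another class.
    for idx in rng.sample(range(len(labels)), n_flip):
        other_classes = [c for c in range(num_classes) if c != labels[idx]]
        noisy[idx] = rng.choice(other_classes)
    return noisy


clean = [i % 3 for i in range(30)]
noisy = inject_symmetric_noise(clean, num_classes=3, noise_rate=0.2)
changed = sum(1 for a, b in zip(clean, noisy) if a != b)
print(f"{changed} of {len(clean)} labels were flipped")
```

    Because every flipped label is forced to differ from the original, the realized noise rate equals the requested one, which keeps comparisons across noise levels straightforward.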

  • Article type: Journal Article
    Semantic segmentation of target objects in power transmission line corridor point cloud scenes is a crucial step in powerline tree barrier detection. The massive quantity, disordered distribution, and non-uniformity of point clouds in these scenes pose significant challenges for feature extraction. Previous studies have often overlooked the central role of spatial information, limiting the network's ability to understand complex geometric shapes. To overcome this limitation, this paper focuses on enhancing the deep expression of spatial geometric information in segmentation networks and proposes a method called BDF-Net to improve RandLA-Net. For each input 3D point cloud, BDF-Net first encodes the relative coordinates and relative distance information into spatial geometric feature representations through the Spatial Information Encoding block to capture the local spatial structure of the point cloud data. Subsequently, the Bilinear Pooling block combines the feature information of the point cloud with the spatial geometric representation by leveraging its bilinear interaction capability, thus learning more discriminative local feature descriptors. The Global Feature Extraction block captures the global structure information in the point cloud data by using the ratio between the point position and the relative position, thereby enhancing the semantic understanding ability of the network. To verify the performance of BDF-Net, this paper constructs a dataset, PPCD, for the point cloud scenario of transmission line corridors and conducts detailed experiments on it. The experimental results show that BDF-Net achieves significant performance improvements in various evaluation metrics, specifically an OA of 97.16%, an mIoU of 77.48%, and an mAcc of 87.6%, which are 3.03%, 16.23%, and 18.44% higher than RandLA-Net, respectively. Moreover, comparisons with other state-of-the-art methods also verify the superiority of BDF-Net in point cloud semantic segmentation tasks.
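    The idea of encoding relative coordinates and relative distances can be illustrated with a minimal pure-Python sketch. It is not the BDF-Net Spatial Information Encoding block itself: the brute-force neighbour search, the function names, and the choice of a (dx, dy, dz, distance) row per neighbour are assumptions made for illustration.

```python
import math


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    dists.sort()
    return [j for _, j in dists[:k]]


def encode_local_geometry(points, k=2):
    """For each point, build one row per neighbour: (dx, dy, dz, distance).

    Hypothetical illustration: relative coordinates plus relative distance
    describe the local spatial structure around each centre point.
    """
    features = []
    for i, centre in enumerate(points):
        rows = []
        for j in knn_indices(points, i, k):
            rel = tuple(n - c for n, c in zip(points[j], centre))
            rows.append(rel + (math.dist(points[j], centre),))
        features.append(rows)
    return features


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
encoded = encode_local_geometry(points, k=2)
print(encoded[0])
```

    In a real network these rows would typically be passed through a learned layer before being combined with per-point features; the brute-force search here is quadratic in the number of points and only suits small examples.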

  • Article type: Journal Article
    For industrial image data, this paper proposes an image classification method based on stochastic configuration networks (SCNs) and multi-scale feature extraction. Multi-scale features are extracted from images at different scales using a deep 2DSCN, and the hidden features of multiple layers are concatenated to obtain more informative features. The integrated features are fed into SCNs to learn a classifier, which improves the recognition rate for different categories. In the experiments, a handwritten digit database and an industrial hot-rolled steel strip database are used, and the comparison results demonstrate that the proposed method can effectively improve classification accuracy.

  • Article type: Journal Article
    In the grain industry, identifying seed purity is a crucial task because it is an important factor in evaluating seed quality. For rice seeds, purity minimizes the unexpected influence of other varieties on rice yield, nutrient composition, and price. However, in practice, rice seeds are often mixed with seeds from other varieties. This study proposes a novel method for automatically identifying the purity of a specific rice variety using hybrid machine learning algorithms. The core concept involves leveraging deep learning architectures to extract pertinent features from raw data, followed by the application of machine learning algorithms for classification. Several experiments are conducted to evaluate the performance of the proposed model through practical implementation. The results show that the novel method substantially outperforms existing methods, demonstrating the potential for effective rice seed purity identification systems.

  • Article type: Journal Article
    This paper introduces a new approach for classifying music genres. The proposed approach transforms an audio signal into a unified representation known as a sound spectrum, from which texture features are extracted using an enhanced Ridgelet Neural Network (RNN). Additionally, the RNN is optimized using an improved version of the partial reinforcement effect optimizer (IPREO), which effectively avoids local optima and enhances the RNN's generalization capability. The GTZAN dataset is used in experiments to assess the effectiveness of the proposed RNN/IPREO model for music genre classification. The results show an accuracy of 92% when a combination of spectral centroid, Mel-spectrogram, and Mel-frequency cepstral coefficients (MFCCs) is used as features. This significantly outperformed K-Means (58%) and Support Vector Machines (up to 68%). Furthermore, the RNN/IPREO model outperformed various deep learning architectures such as Neural Networks (65%), RNNs (84%), CNNs (88%), DNNs (86%), VGG-16 (91%), and ResNet-50 (90%). It is worth noting that the RNN/IPREO model achieved results comparable to well-known deep models like VGG-16, ResNet-50, and RNN-LSTM, sometimes even surpassing their scores. This highlights the strength of its hybrid CNN-Bi-directional RNN design in conjunction with the IPREO parameter optimization algorithm for extracting intricate and sequential auditory data.
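    Of the three features named above, the spectral centroid is the simplest to state: it is the magnitude-weighted mean frequency of a frame's spectrum. The sketch below uses the standard definition and is not taken from the paper; the function name, the one-sided spectrum layout (bins evenly spaced from 0 Hz to the Nyquist frequency), and the sample values are assumptions.

```python
def spectral_centroid(magnitudes, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one spectral frame.

    Assumes `magnitudes` is a one-sided spectrum whose bins are evenly
    spaced from 0 Hz up to the Nyquist frequency (sample_rate / 2).
    """
    n_bins = len(magnitudes)
    total = sum(magnitudes)
    if n_bins < 2 or total == 0:
        return 0.0  # silent or degenerate frame: no meaningful centroid
    nyquist = sample_rate / 2
    freqs = [k * nyquist / (n_bins - 1) for k in range(n_bins)]
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total


# All energy in the second of five bins (0, 1000, 2000, 3000, 4000 Hz).
print(spectral_centroid([0.0, 1.0, 0.0, 0.0, 0.0], sample_rate=8000))
```

    A brighter sound concentrates energy in higher bins and so yields a higher centroid, which is why the feature is often paired with Mel-spectrograms and MFCCs for genre classification.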

  • Article type: Journal Article
    A priority for machine learning in healthcare and other high-stakes applications is to enable end-users to easily interpret individual predictions. This opinion piece outlines recent developments in interpretable classifiers and methods to open black-box models.

  • Article type: Journal Article
    Deep learning has emerged as a robust tool for automating feature extraction from three-dimensional images, offering an efficient alternative to labour-intensive and potentially biased manual image segmentation methods. However, there has been limited exploration of optimal training set sizes, including whether artificial expansion by data augmentation can achieve consistent results in less time and how consistent these benefits are across different types of traits. In this study, we manually segmented 50 planktonic foraminifera specimens from the genus Menardella to determine the minimum number of training images required to produce accurate volumetric and shape data from internal and external structures. The results reveal, unsurprisingly, that deep learning models improve with a larger number of training images, with eight specimens being required to achieve 95% accuracy. Furthermore, data augmentation can enhance network accuracy by up to 8.0%. Notably, predicting both volumetric and shape measurements for the internal structure poses a greater challenge than for the external structure, owing to low contrast between different materials and increased geometric complexity. These results provide novel insight into optimal training set sizes for precise image segmentation of diverse traits and highlight the potential of data augmentation for enhancing multivariate feature extraction from three-dimensional images.

  • Article type: Journal Article
    Plant diseases can spread rapidly, leading to significant crop losses if not detected early. By accurately identifying diseased plants, farmers can target treatment only to the affected areas, reducing the amount of pesticide or fungicide needed and minimizing environmental impact. Tomatoes are among the most significant and extensively consumed crops worldwide. The main factor affecting crop yield quantity and quality is leaf disease. Various diseases can affect tomato production, impacting both yield and quality. Automated classification of leaf images allows for the early identification of diseased plants, enabling prompt intervention and control measures. Many creative approaches to diagnosing and categorizing specific diseases have been widely employed. The manual method is costly and labor-intensive. Image processing combined with machine learning algorithms can facilitate disease detection without the assistance of an agricultural specialist. In this study, diseases in tomato leaves are detected with a new feature extraction method that uses conformable polynomial image features, enabling accurate and faster detection of plant diseases through a machine learning model. The methodology of this study is based on: (1) preprocessing, feature extraction, dimension reduction, and classification modules; (2) a conformable polynomials method used to extract the texture features, which are passed to the classifier; (3) a proposed texture feature constructed from two parts, the enhancement-based term and the texture detail part, for texture analysis; and (4) tomato leaf samples from the PlantVillage image dataset, which were used to gather the data for this model. Disease detection on tomato leaf images reached 98.80% accuracy using an SVM classifier. In addition to lowering financial loss, the suggested feature extraction method can help manage plant diseases effectively, improving crop yield and food security.

  • Article type: Journal Article
    The identification of anticancer peptides (ACPs) is crucial, especially in the development of peptide-based cancer therapy. Classical models such as Split Amino Acid Composition (SAAC) and Pseudo Amino Acid Composition (PseAAC) lack rich feature representations, whereas improved feature representations raise the predictive accuracy and efficiency of ACP identification. This research therefore proposes and develops an advanced framework based on feature extraction, the Extended Dipeptide Composition (EDPC) framework. EDPC extends the dipeptide composition by considering local sequence environment information and by reforming the CD-HIT framework to remove noise and redundancy. To measure accuracy, we performed several experiments using four well-known machine learning (ML) algorithms: Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and K Nearest Neighbor (KNN). For comparison, we used accuracy, specificity, sensitivity, precision, recall, and F1-score as evaluation criteria. The reliability of the proposed framework was further evaluated using statistical significance tests. The proposed EDPC framework outperformed SAAC and PseAAC: the SVM model delivered the highest accuracy of 96.6%, with significant improvements in specificity, sensitivity, precision, and F1-score over multiple datasets. The proposed EDPC achieves higher classification performance because it incorporates enhanced feature representation together with local and global sequence profiles. The proposed framework can handle noise and duplicated features, and it is accompanied by a wide range of feature representations. Finally, the proposed framework can be used for clinical applications where ACP identification is essential. Future work will include extending to a larger variety of datasets, incorporating tertiary structural information, and using deep learning techniques to improve the proposed EDPC.
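    For context, the plain dipeptide composition that EDPC extends can be sketched as follows. This is the standard 400-dimensional baseline feature (the frequency of each adjacent amino-acid pair), not the EDPC framework itself, whose local-environment extensions are not specified in the abstract. The function name and the example sequence are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of adjacent-pair frequencies.

    Baseline feature only: pairs containing non-standard residues are
    skipped, and the counts are normalised by the number of valid pairs.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for first, second in zip(seq, seq[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


vector = dipeptide_composition("ACAC")
print(len(vector), vector[DIPEPTIDES.index("AC")])
```

    The resulting fixed-length vector can be fed to any of the classifiers named in the abstract (SVM, DT, RF, KNN), regardless of peptide length.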

  • Article type: Journal Article
    Hyperspectral image (HSI) classification is a vital part of the HSI application field. Since HSIs contain rich spectral information, effectively extracting deep representation features is a major challenge. In existing methods, although edge data augmentation is used to strengthen the edge representation, a large amount of high-frequency noise is also introduced at the edges. In addition, the importance of different spectra for classification decisions has not been emphasized. To address these challenges, we propose an edge-aware and spectral-spatial feature learning network (ESSN). ESSN contains an edge feature augment block and a spectral-spatial feature extraction block. First, in the edge feature augment block, the edges of the image are sensed, and the edge features of different spectral bands are adaptively strengthened. Then, in the spectral-spatial feature extraction block, the weights of different spectra are adaptively adjusted, and more comprehensive deep representation features are extracted on this basis. Extensive experiments on three publicly available hyperspectral datasets indicate that the proposed method achieves higher accuracy and greater immunity to interference than state-of-the-art (SOTA) methods.