Feature Extraction

  • Article type: Journal Article
    Extracting typical features of underwater target signals and applying an effective recognition algorithm are the keys to underwater acoustic recognition of divers. This paper proposes a feature extraction method for diver signals, frequency-domain multi-sub-band energy (FMSE), aimed at accurate recognition of diver targets by passive sonar. Using experimental data collected under different conditions, such as pools and lakes, the study examines how the method is affected by the presence or absence of targets, the number of targets, the signal-to-noise ratio, and the detection distance. FMSE showed the best robustness and performance compared with two other signal feature extraction methods: mel-frequency cepstral coefficient filtering and gammatone frequency cepstral coefficient filtering. Combined with the commonly used support vector machine recognition algorithm, FMSE achieves an overall recognition accuracy above 94% for frogman underwater acoustic targets, which indicates that the method is suitable for underwater acoustic recognition of diver targets.
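
The abstract does not spell out how FMSE lays out its sub-bands, so the following is only a minimal sketch of the general idea of frequency-domain sub-band energies. The equal-width band layout, the normalization to unit sum, and the function names `power_spectrum` and `subband_energies` are assumptions made for illustration, not the authors' method.

```python
import cmath
import math


def power_spectrum(signal):
    """One-sided power spectrum via a direct DFT (O(n^2), fine for short frames)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def subband_energies(signal, n_bands):
    """Split the spectrum into equal-width bands (an assumption) and return
    each band's share of the total energy."""
    spec = power_spectrum(signal)
    width = len(spec) // n_bands
    if width == 0:
        raise ValueError("frame too short for the requested number of bands")
    # Bins left over when the spectrum length is not a multiple of n_bands are dropped.
    energies = [sum(spec[b * width:(b + 1) * width]) for b in range(n_bands)]
    total = sum(energies)
    return [e / total for e in energies] if total > 0 else energies


# A 64-sample tone at DFT bin 20 lands in the third of four bands (bins 16..23).
frame = [math.sin(2 * math.pi * 20 * t / 64) for t in range(64)]
features = subband_energies(frame, 4)
```

The resulting vector could then be passed to any classifier, such as the support vector machine mentioned in the abstract.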

  • Article type: Journal Article
    The potential for rotor component shedding in rotating machinery poses significant risks and calls for an early, precise fault diagnosis technique to prevent catastrophic failures and reduce maintenance costs. This study introduces a data-driven approach that detects rotor component shedding at its onset, thereby improving operational safety and minimizing downtime. Using frequency analysis, it identifies harmonic amplitudes in rotor vibration data as key indicators of impending faults. The method applies principal component analysis (PCA) to orthogonalize the vibration data from rotor sensors and reduce its dimensionality, then uses k-fold cross-validation to select a subset of significant features so that the detection algorithm is robust and generalizes well. These features are fed into a linear discriminant analysis (LDA) model, which serves as the diagnostic engine and predicts the probability of rotor component shedding. The approach is demonstrated on 16 industrial compressors and turbines, showing its value in providing timely fault warnings and enhancing operational reliability.
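
To make the "harmonic amplitudes" step concrete, here is a small sketch that probes a vibration frame at multiples of the shaft frequency. The sampling rate, the 50 Hz base frequency, and the function name `harmonic_amplitudes` are illustrative assumptions; the abstract does not give the authors' actual extraction procedure, and the PCA, cross-validation, and LDA stages are not reproduced here.

```python
import cmath
import math


def harmonic_amplitudes(signal, sample_rate, base_freq, n_harmonics=3):
    """Estimate the amplitude at base_freq and its next harmonics.

    Each harmonic is probed with a single-frequency DFT. The estimate is exact
    only when the frame holds a whole number of periods of that harmonic.
    """
    n = len(signal)
    amplitudes = []
    for h in range(1, n_harmonics + 1):
        freq = h * base_freq
        acc = sum(x * cmath.exp(-2j * math.pi * freq * t / sample_rate)
                  for t, x in enumerate(signal))
        amplitudes.append(2.0 * abs(acc) / n)
    return amplitudes


# Synthetic frame: 50 Hz fundamental (amplitude 1.0) plus a 100 Hz harmonic (0.5).
fs, f0 = 1000.0, 50.0
frame = [math.sin(2 * math.pi * f0 * t / fs)
         + 0.5 * math.sin(2 * math.pi * 2 * f0 * t / fs)
         for t in range(200)]
amps = harmonic_amplitudes(frame, fs, f0)
```

Amplitude vectors of this kind, computed per sensor, are the sort of input that a dimensionality-reduction step such as PCA would then compress.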

  • Article type: Journal Article
    BACKGROUND: The use of machine learning in medical diagnosis and treatment has grown significantly in recent years with the development of computer-aided diagnosis systems, which are often based on annotated medical radiology images. However, the lack of large annotated image datasets remains a major obstacle, because annotation is time-consuming and costly. This study aims to overcome this challenge by proposing an automated method for annotating a large database of medical radiology images based on their semantic similarity.
    RESULTS: An automated, unsupervised approach is used to create a large annotated dataset of medical radiology images originating from the Clinical Hospital Centre Rijeka, Croatia. The pipeline is built by data-mining three different types of medical data: images, DICOM metadata, and narrative diagnoses. The optimal feature extractors are integrated into a multimodal representation, which is then clustered to create an automated pipeline for labelling a precursor dataset of 1,337,926 medical images into 50 clusters of visually similar images. The quality of the clusters is assessed by examining their homogeneity and mutual information, taking into account the anatomical region and modality representation.
    CONCLUSIONS: The results indicate that fusing the embeddings of all three data sources provides the best results for unsupervised clustering of large-scale medical data and leads to the most concise clusters. Hence, this work marks the initial step towards building a much larger and more fine-grained annotated dataset of medical radiology images.
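
As an illustration of the fusion and evaluation steps, the sketch below concatenates per-modality embeddings and scores a clustering by homogeneity. Fusing by L2-normalizing and concatenating is an assumption (the abstract only says the embeddings are fused), and `fuse_embeddings` and `homogeneity` are hypothetical helper names. The homogeneity formula itself is the standard one, 1 - H(class | cluster) / H(class).

```python
import math
from collections import Counter


def l2_normalize(vec):
    """Scale a vector to unit length so that no modality dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_embeddings(*modalities):
    """Concatenate the L2-normalized embeddings of one sample (assumed fusion rule)."""
    fused = []
    for vec in modalities:
        fused.extend(l2_normalize(vec))
    return fused


def homogeneity(true_labels, cluster_ids):
    """Return 1 - H(class | cluster) / H(class); 1.0 means each cluster holds one class."""
    n = len(true_labels)
    class_counts = Counter(true_labels)
    h_class = -sum((c / n) * math.log(c / n) for c in class_counts.values())
    if h_class == 0:
        return 1.0
    cluster_sizes = Counter(cluster_ids)
    joint = Counter(zip(cluster_ids, true_labels))
    h_cond = -sum((c / n) * math.log(c / cluster_sizes[k])
                  for (k, _), c in joint.items())
    return 1.0 - h_cond / h_class


# Toy example: image, DICOM-metadata and diagnosis-text embeddings for one study.
sample = fuse_embeddings([3.0, 4.0], [0.0, 5.0], [1.0, 0.0])
score = homogeneity(["ct", "ct", "mr", "mr"], [1, 1, 0, 0])
```

In this reading, the true labels would be attributes such as anatomical region or imaging modality, which the abstract names as the basis for judging cluster quality.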

  • Article type: Journal Article
    The use of anti-inflammatory peptides (AIPs) as an alternative therapeutic approach for inflammatory diseases holds great research significance. Because identifying AIPs experimentally is costly and difficult, discovering and designing peptides computationally before the experimental stage has become a promising technology. In this study, we present BertAIP, a method based on Bidirectional Encoder Representations from Transformers (BERT) that predicts AIPs directly from their amino acid sequences without using any other information. BertAIP uses a BERT model to extract features of a protein and a fully connected feed-forward network for AIP classification. It was constructed and evaluated using AIP datasets reconstructed from the latest Immune Epitope Database. The experimental results showed that BertAIP achieved an accuracy of 0.751 and a Matthews correlation coefficient of 0.451, both higher than those of other commonly used methods. The results of the independent test suggested that BertAIP outperformed the existing AIP predictors. In addition, to enhance the interpretability of BertAIP, we explored and visualized the amino acids that the model considered important for AIP prediction. We believe that BertAIP will be a useful tool for large-scale screening and identification of novel AIPs for drug development and therapeutic research related to inflammatory diseases.
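
The two reported metrics can be computed from a binary confusion matrix as sketched below. The label encoding (1 for AIP, 0 for non-AIP) and the toy predictions are assumptions for illustration and are unrelated to the paper's data.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 1 (positive) and 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, _, _ = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def matthews_corrcoef(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 if undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels: 1 = peptide predicted or known to be anti-inflammatory.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
acc = accuracy(y_true, y_pred)
mcc = matthews_corrcoef(y_true, y_pred)
```

Unlike accuracy, the Matthews correlation coefficient stays informative when the two classes are imbalanced, which is one reason both numbers are commonly reported together.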

  • Article type: Journal Article
    Fine-tuning is an important technique in transfer learning that has achieved significant success in tasks that lack training data. However, when the data distributions of the source and target domains differ greatly, single-source domain fine-tuning struggles to extract effective features. To address this issue, we propose a multi-source transfer learning framework called adaptive multi-source domain collaborative fine-tuning (AMCF). AMCF uses multiple source domain models for collaborative fine-tuning, thereby improving the feature extraction capability of the model in the target task. Specifically, AMCF employs an adaptive multi-source domain layer selection strategy to customize appropriate layer fine-tuning schemes for the target task among multiple source domain models, aiming to extract more effective features. Furthermore, a novel multi-source domain collaborative loss function is designed to help each source domain model extract target data features precisely. At the same time, it minimizes the output difference among the source domain models, thereby enhancing their adaptability to the target data. To validate its effectiveness, AMCF is applied to seven public visual classification datasets commonly used in transfer learning and compared with the most widely used single-source domain fine-tuning methods. Experimental results demonstrate that, compared with existing fine-tuning methods, our method not only enhances the accuracy of feature extraction in the model but also provides precise layer fine-tuning schemes for the target task, thereby significantly improving fine-tuning performance.
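
The abstract describes the collaborative loss only in words: each source model should fit the target data, and the outputs of the source models should agree. The sketch below is one possible reading of that description, not the loss from the paper. The averaged cross-entropy term, the pairwise squared-difference penalty, the `weight` parameter, and the function name `collaborative_loss` are all assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def collaborative_loss(logits_per_model, label, weight=0.5):
    """Average cross-entropy of all source models plus a penalty on how much
    their predicted distributions differ (hypothetical formulation)."""
    probs = [softmax(logits) for logits in logits_per_model]
    task_term = sum(-math.log(p[label]) for p in probs) / len(probs)
    pairs = [(i, j) for i in range(len(probs)) for j in range(i + 1, len(probs))]
    if not pairs:
        return task_term
    gap_term = sum(sum((a - b) ** 2 for a, b in zip(probs[i], probs[j]))
                   for i, j in pairs) / len(pairs)
    return task_term + weight * gap_term


# Two source models scoring one target sample whose true class is 0.
agreeing = collaborative_loss([[2.0, 0.0], [2.0, 0.0]], label=0)
disagreeing = collaborative_loss([[2.0, 0.0], [0.0, 2.0]], label=0)
```

When the models agree, the penalty term vanishes and only the task term remains; disagreement raises the loss, which pushes the source models toward consistent outputs on the target data.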

  • Article type: Journal Article
    Dyslexia is a neurological disorder that affects an individual's language processing abilities. Early care and intervention can help dyslexic individuals succeed academically and socially. Recent developments in deep learning (DL) have motivated researchers to build dyslexia detection models (DDMs). DL approaches facilitate the integration of multi-modality data. However, there are few multi-modality-based DDMs.
    In this study, the authors built a DL-based DDM using multi-modality data. A squeeze-and-excitation (SE) integrated MobileNet V3 model, a self-attention (SA) based EfficientNet B7 model, and early-stopping and SA-based bidirectional long short-term memory (Bi-LSTM) models were developed to extract features from magnetic resonance imaging (MRI), functional MRI (fMRI), and electroencephalography (EEG) data. In addition, the authors fine-tuned a LightGBM model using the Hyperband optimization technique to detect dyslexia from the extracted features. Three datasets containing fMRI, MRI, and EEG data were used to evaluate the performance of the proposed DDM.
    The findings supported the significance of the proposed DDM in detecting dyslexia with limited computational resources. The proposed model outperformed the existing DDMs, producing accuracies of 98.9%, 98.6%, and 98.8% on the fMRI, MRI, and EEG datasets, respectively. Healthcare centers and educational institutions can benefit from the proposed model to identify dyslexia in its initial stages. The interpretability of the proposed model can be improved by integrating vision-transformer-based feature extraction.

  • Article type: Journal Article
    This study proposes a feature extraction and pattern recognition method for the flow noise signals of natural gas pipelines. It responds to the complex situation created by the rapid development and expansion of urban natural gas infrastructure in China, where active and abandoned pipelines, metal and nonmetal pipelines, and natural gas, water, and power pipelines coexist underground. Because the underground situation is unknown, gas leakage incidents caused by natural gas pipeline rupture occur from time to time and pose a threat to personal safety. The motivation of this study is therefore to provide a feasible method to accelerate the renewal and transformation of aging urban natural gas pipelines, ensure the safe operation of the urban natural gas pipeline network, and promote the high-quality development of the urban economy. Combining experimental tests with numerical simulation, the study establishes a database of urban natural gas pipeline flow noise signals, uses principal component analysis (PCA) to extract the characteristics of the flow noise signals, and develops a mathematical model for feature extraction. A classification and recognition model based on a backpropagation neural network (BPNN) is then constructed to detect and recognize convective noise signals. The results show that the theoretical method based on acoustic feature analysis provides guidance for the orderly and safe construction of the urban natural gas pipeline network and ensures its safe operation. Simulation analysis of 75 groups of gas pipeline flow noise under different working conditions, combined with experimental verification on ground flow noise signals, shows that the proposed feature extraction and pattern recognition method reaches a recognition accuracy of up to 97% under strong background noise. This confirms the accuracy of the numerical simulation and provides a theoretical basis and technical support for the detection and recognition of urban gas pipeline flow noise.
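
As a rough illustration of PCA-based feature extraction, the sketch below finds the first principal component of a small feature table by power iteration on the covariance matrix. It is a generic textbook procedure, not the mathematical model developed in the paper. The function name `first_principal_component` and the toy data are assumptions.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the leading principal component.

    rows is a list of equal-length feature vectors. The direction is found by
    power iteration on the sample covariance matrix.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    direction = [1.0] * d
    for _ in range(iterations):
        product = [sum(cov[i][j] * direction[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0:
            break  # No variance left along the current direction.
        direction = [x / norm for x in product]
    # Project each centered sample onto the component to get a 1-D feature.
    scores = [sum(c[j] * direction[j] for j in range(d)) for c in centered]
    return direction, scores


# Toy "noise features" that vary along the line y = 2x.
rows = [[float(x), 2.0 * x] for x in range(5)]
direction, scores = first_principal_component(rows)
```

The projected scores are the kind of compact features that would then be passed to a classifier such as the BPNN named in the abstract.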

  • Article type: Journal Article
    This study aimed to improve the classification and regression performance for total volatile basic nitrogen (TVB-N) and acid value (AV) in fish meal samples of different freshness detected by a metal-oxide semiconductor electronic nose (MOS e-nose). Four input sets were used to identify freshness: 402 original features, 62 manually extracted features, features manually extracted and then selected by the RFRFE method, and features extracted by a long short-term memory (LSTM) network. The classification performance for freshness grades and the estimation performance for TVB-N and AV values were compared across these inputs. Based on the sensor response curves, preprocessing and feature extraction steps were first applied to the original data. Then, five classification algorithms and four regression algorithms were used for modeling. The results showed that the LSTM network extracted a total of 30 features, a significant reduction in the number of features. In classification, the highest accuracy, 95.4%, was obtained with the support vector machine method. In regression, the least squares support vector regression method obtained the best root mean square error (RMSE). The coefficient of determination (R²), RMSE, and relative standard deviation (RSD) between the predicted and actual TVB-N values were 0.963, 11.01, and 7.9%, respectively. The R², RMSE, and RSD between the predicted and actual AV values were 0.972, 0.170, and 6.05%, respectively. The LSTM feature extraction method provides a new approach and reference for e-nose feature extraction when identifying other animal-derived material samples.
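
Two of the reported regression metrics have standard definitions and can be sketched as follows. RSD is left out because the abstract does not state which definition was used. The sample values are made up for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up measurements and predictions, not data from the study.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
error = rmse(actual, predicted)
fit = r_squared(actual, predicted)
```

RMSE is expressed in the units of the measured quantity, so the TVB-N and AV values quoted above are not directly comparable with each other, whereas R² is unitless.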

  • Article type: Journal Article
    Panoramic stereo video brings audiences a new visual experience through its immersion and stereo effect. Faces are an important element of panoramic stereo video, but face images in such video are deformed to varying degrees, which poses new challenges for face recognition. This paper therefore proposes DCM2Net (Deformable Convolution MobileFaceNet), a face recognition model for panoramic stereo video. During feature fusion, the model mainly integrates feature information across channels, redistributes information between channels in the deeper part of the network, and makes full use of the information in different channels for feature extraction. The paper also builds a panoramic stereo video live system that uses DCM2Net to recognize faces in panoramic stereo video and displays the recognition results in the video. Experiments on different datasets show that the model achieves better results on both popular datasets and panoramic datasets.

  • Article type: Journal Article
    The goal of this study was to develop a method for detecting and classifying organophosphorus pesticides (OPPs) in bodies of water. Sixty-five samples with different concentrations were prepared for each of ten organophosphorus pesticides: chlorpyrifos, acephate, parathion-methyl, trichlorphon, dichlorvos, profenofos, malathion, dimethoate, fenthion, and phoxim. First, the spectral data of all the samples were obtained using a UV-visible spectrometer. Second, five preprocessing methods, six manifold learning methods, and five machine learning algorithms were used to build detection models for identifying OPPs in water bodies. The findings indicate that machine learning models trained on data preprocessed with convolutional smoothing + first-order derivatives (SG + FD) are more accurate than models trained on data preprocessed with other methods. The backpropagation neural network (BPNN) model exhibited the highest accuracy at 99.95%, followed by the support vector machine (SVM) and convolutional neural network (CNN) models, both at 99.92%. The extreme learning machine (ELM) and K-nearest neighbors (KNN) models reached accuracies of 99.84% and 99.81%, respectively. After a manifold learning algorithm was applied to the full-wavelength data set for dimensionality reduction, the data were visualized in the first three dimensions. The results demonstrate that the t-distributed stochastic neighbor embedding (t-SNE) algorithm is superior, exhibiting dense clustering of similar samples and clear separation of dissimilar ones. SG + FD-t-SNE-SVM ranks highest among the feature extraction models in terms of performance. With the feature extraction dimension set to 4, the average classification accuracy was 99.98%, a slight improvement in prediction performance over the full-wavelength model. As shown in this study, an ultraviolet-visible (UV-visible) spectroscopy system combined with the t-SNE and SVM algorithms can effectively identify and classify OPPs in water bodies.
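
The SG + FD preprocessing step can be sketched as below, assuming that "SG" denotes Savitzky-Golay smoothing. The 5-point window and quadratic polynomial are assumptions, since the abstract gives neither the window length nor the polynomial order. The coefficients used are the standard ones for that window.

```python
def savgol_smooth5(y):
    """5-point quadratic Savitzky-Golay smoothing.

    The two samples at each edge are copied unchanged because the window
    does not fit there.
    """
    coeffs = (-3, 12, 17, 12, -3)
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(w * y[i + k - 2] for k, w in enumerate(coeffs)) / 35.0
    return out


def savgol_derivative5(y, step=1.0):
    """First derivative from a 5-point quadratic Savitzky-Golay fit.

    Returns values for interior samples only, so the result is 4 shorter than y.
    step is the spacing between samples (for spectra, the wavelength increment).
    """
    coeffs = (-2, -1, 0, 1, 2)
    return [sum(w * y[i + k - 2] for k, w in enumerate(coeffs)) / (10.0 * step)
            for i in range(2, len(y) - 2)]


# Toy absorbance curve y = x^2 sampled at x = 0..6; its true derivative is 2x.
spectrum = [float(x * x) for x in range(7)]
smoothed = savgol_smooth5(spectrum)
slope = savgol_derivative5(spectrum)
```

Taking the derivative removes constant baseline offsets between spectra, which is a common motivation for this kind of preprocessing before dimensionality reduction and classification.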