K-nearest neighbor

  • Article type: Journal Article
    The detection performance of radar is significantly impaired by active jamming and mutual interference from other radars. This paper proposes a radio signal modulation recognition method that accurately recognizes these signals, which supports jamming cancellation decisions. Based on an ensemble learning stacking algorithm improved by meta-feature enhancement, the proposed method adopts random forests, K-nearest neighbors, and Gaussian naive Bayes as the base learners, with logistic regression serving as the meta-learner. It takes multi-domain features of the signals as input: time-domain features, including fuzzy entropy, slope entropy, and Hjorth parameters; frequency-domain features, including spectral entropy; and fractal-domain features, including fractal dimension. A simulation experiment covering seven common signal types of radar and active jamming was performed to validate effectiveness and evaluate performance. The results showed that the proposed method outperforms other classification methods and can meet the requirements of low signal-to-noise ratio and few-shot learning.
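
The Hjorth parameters named among the time-domain features can be sketched as follows. This is a minimal illustration using the standard definitions (activity, mobility, complexity) and only the Python standard library; the function name `hjorth_parameters` and the sine test signal are assumptions for illustration, not the authors' feature pipeline.

```python
import math
from statistics import pvariance


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) of a 1-D sequence."""
    # Discrete first and second differences stand in for derivatives.
    first_diff = [b - a for a, b in zip(signal, signal[1:])]
    second_diff = [b - a for a, b in zip(first_diff, first_diff[1:])]

    activity = pvariance(signal)  # signal power
    mobility = math.sqrt(pvariance(first_diff) / activity)
    mobility_of_diff = math.sqrt(pvariance(second_diff) / pvariance(first_diff))
    complexity = mobility_of_diff / mobility  # close to 1 for a pure sine
    return activity, mobility, complexity


# Hypothetical test signal: ten full periods of a unit-amplitude sine wave.
n_samples, n_periods = 1000, 10
omega = 2 * math.pi * n_periods / n_samples
sine = [math.sin(omega * n) for n in range(n_samples)]
activity, mobility, complexity = hjorth_parameters(sine)
```

For a pure sine the complexity stays near 1, so larger values hint at a signal shape that departs from a single sinusoid, which is one reason such features may help separate modulation types.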

  • Article type: Journal Article
    (1) Background: The authenticity of eggs in relation to the housing system of laying hens is susceptible to food fraud due to the potential for egg mislabeling. (2) Methods: A total of 4188 egg yolks, obtained from four different breeds of laying hens housed in colony cage, barn, free-range, and organic systems, were analyzed using 1H NMR spectroscopy. The data of the resulting 1H NMR spectra were used with different machine learning methods to build classification models for the four housing systems. (3) Results: The comparison of the seven computed models showed that the support vector machine (SVM) model gave the best results, with a cross-validation accuracy of 98.5%. Testing the classification models with eggs from supermarkets showed that at most 62.8% of samples were classified according to the housing system labeled on the eggs. (4) Conclusion: The classification models developed in this study included the largest sample size compared to the literature. The SVM model is most suitable for evaluating 1H NMR data in terms of the hen housing system. The test with supermarket samples showed that more authentic samples are required to analyze influencing factors such as breed, feeding, and housing changes.
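
The cross-validation accuracy reported above depends on splitting the samples into folds. The sketch below shows one plain way to generate such folds; the abstract does not state the number of folds or whether samples were shuffled or stratified, so `k_fold_indices` and its behavior (contiguous, unshuffled folds) are assumptions for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; fold sizes differ by at most one."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical example: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
fold_sizes = [len(test) for _, test in folds]
```

Each sample lands in exactly one test fold, so averaging the per-fold accuracies uses every sample for testing once. In practice the sample order would usually be shuffled before splitting.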

  • Article type: Journal Article
    Dynamic wireless charging (DWC) has emerged as a viable approach to mitigate range anxiety by ensuring continuous and uninterrupted charging for electric vehicles in motion. Depending on the length of the transmitter, DWC systems can be categorized into long-track transmitters and segmented coil arrays. The segmented coil array, favored for its higher efficiency and reduced electromagnetic interference, is the preferred option. However, such DWC systems need to detect the vehicle's position, specifically to activate the transmitter coils aligned with the receiver pad and de-energize uncoupled transmitter coils. This paper introduces various machine learning algorithms for precise vehicle position determination that accommodate diverse ground clearances of electric vehicles and various speeds. In a comparison of eight machine learning algorithms, the random forest algorithm performed best, with the lowest error in predicting the actual position.

  • Article type: Clinical Study
    In daily life, both acoustic factors and social context can affect listening effort investment. In laboratory settings, information about listening effort has been deduced from pupil and cardiovascular responses independently. The extent to which these measures can jointly predict listening-related factors is unknown. Here we combined pupil and cardiovascular features to predict acoustic and contextual aspects of speech perception. Data were collected from 29 adults (mean = 64.6 years, SD = 9.2) with hearing loss. Participants performed a speech perception task at two individualized signal-to-noise ratios (corresponding to 50% and 80% of sentences correct) and in two social contexts (the presence and absence of two observers). Seven features were extracted per trial: baseline pupil size, peak pupil dilation, mean pupil dilation, interbeat interval, blood volume pulse amplitude, pre-ejection period, and pulse arrival time. These features were used to train k-nearest neighbor classifiers to predict task demand, social context, and sentence accuracy. The k-fold cross-validation on the group-level data revealed above-chance classification accuracies: task demand, 64.4%; social context, 78.3%; and sentence accuracy, 55.1%. However, classification accuracies diminished when the classifiers were trained and tested on data from different participants. Individually trained classifiers (one per participant) performed better than group-level classifiers: 71.7% (SD = 10.2) for task demand, 88.0% (SD = 7.5) for social context, and 60.0% (SD = 13.1) for sentence accuracy. We demonstrated that classifiers trained on group-level physiological data to predict aspects of speech perception generalized poorly to novel participants. Individually calibrated classifiers hold more promise for future applications.
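
The difference between pooled cross-validation and testing on unseen participants can be sketched with a small k-nearest neighbor classifier and a leave-one-participant-out loop. The function names, the value k = 3, and the toy data are assumptions for illustration; the study's actual features, k, and validation code are not given in the abstract.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    ranked = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_participant_out(features, labels, participants, k=3):
    """Accuracy per held-out participant; training never sees that participant."""
    scores = {}
    for held_out in sorted(set(participants)):
        train = [i for i, p in enumerate(participants) if p != held_out]
        test = [i for i, p in enumerate(participants) if p == held_out]
        train_x = [features[i] for i in train]
        train_y = [labels[i] for i in train]
        hits = sum(knn_predict(train_x, train_y, features[i], k) == labels[i] for i in test)
        scores[held_out] = hits / len(test)
    return scores


# Made-up, well-separated toy trials for three hypothetical participants.
features = [(0.1, 0.2), (0.2, 0.1), (0.9, 1.0), (1.0, 0.9),
            (0.15, 0.25), (0.25, 0.15), (0.95, 1.05), (1.05, 0.95),
            (0.0, 0.1), (0.1, 0.0), (1.1, 1.0), (1.0, 1.1)]
labels = ["low", "low", "high", "high"] * 3
participants = ["p1"] * 4 + ["p2"] * 4 + ["p3"] * 4
scores = leave_one_participant_out(features, labels, participants)
```

Splitting by participant rather than by trial keeps one person's trials from appearing on both sides of the split, which is the condition under which the reported accuracies dropped.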

  • Article type: Journal Article
    This study introduces an innovative approach for fault diagnosis of a multistage centrifugal pump (MCP) using explanatory ratio (ER) linear discriminant analysis (LDA). Initially, the method addresses the challenge of background noise and interference in vibration signals by identifying a fault-sensitive frequency band (FSFB). From the FSFB, raw hybrid statistical features are extracted in the time, frequency, and time-frequency domains, forming a comprehensive feature pool. Recognizing that not all features adequately represent MCP conditions and that some can reduce classification accuracy, we propose a novel ER-LDA method. ER-LDA evaluates feature importance by calculating the explanatory ratio between interclass distance and intraclass scatteredness, facilitating the selection of discriminative features through LDA. This fusion of ER-based feature assessment and LDA yields the novel ER-LDA technique. The resulting selective feature set is then passed into a k-nearest neighbor (K-NN) algorithm for condition classification, distinguishing between normal, mechanical seal hole, mechanical seal scratch, and impeller defect states of the MCP. The proposed technique surpasses current cutting-edge techniques in fault classification.
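
The idea of scoring a feature by the ratio of interclass distance to intraclass scatter can be sketched with a Fisher-style ratio. The abstract does not give the exact ER formula, so the definition below (spread of class means divided by the size-weighted mean of within-class variances) is a stand-in assumption, and `separability_ratio`, `rank_features`, and the toy data are hypothetical names and values.

```python
from statistics import mean, pvariance


def separability_ratio(values, labels):
    """Between-class spread of class means divided by mean within-class variance."""
    classes = sorted(set(labels))
    groups = [[v for v, y in zip(values, labels) if y == c] for c in classes]
    overall = mean(values)
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups) / len(values)
    within = sum(len(g) * pvariance(g) for g in groups) / len(values)
    return between / within if within > 0 else float("inf")


def rank_features(rows, labels):
    """Return (feature indices from most to least separable, per-feature ratios)."""
    n_features = len(rows[0])
    ratios = [separability_ratio([r[j] for r in rows], labels) for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: ratios[j], reverse=True)
    return order, ratios


# Made-up data: feature 0 separates the classes, feature 1 does not.
rows = [(1.0, 5.0), (1.2, 3.0), (0.8, 4.0), (3.0, 4.5), (3.2, 3.5), (2.8, 4.0)]
labels = ["normal"] * 3 + ["fault"] * 3
order, ratios = rank_features(rows, labels)
```

Features at the front of `order` would be kept and passed on to the classifier; how many to keep is a design choice the abstract does not specify.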

  • Article type: Journal Article
    This study develops a solution to sports match-fixing that uses various machine-learning models to detect match-fixing anomalies based on betting odds. We use five models to distinguish between normal and abnormal matches: logistic regression (LR), random forest (RF), support vector machine (SVM), k-nearest neighbor (KNN) classification, and an ensemble model optimized from the previous four. The models classify normal and abnormal matches by learning their patterns from sports betting odds data. The database was developed from the world football league match betting data of 12 betting companies, which offered a vast collection of data on players, teams, game schedules, and league rankings for football matches. We develop an abnormal match detection model based on the data analysis results of each model, using the match result dividend data. We then use data from real-time matches and apply the five models to construct a system capable of detecting match-fixing in real time. The RF, KNN, and ensemble models recorded a high accuracy, over 92%, whereas the LR and SVM models were approximately 80% accurate. In comparison, previous studies have used a single model to examine football match betting odds data, with an accuracy of 70-80%.

  • Article type: Journal Article
    Epilepsy is a type of brain disorder triggered by an abrupt electrical imbalance of neuronal networks. An electroencephalogram (EEG) is a diagnostic tool to capture the underlying brain mechanisms and detect seizure onset in epileptic patients. To detect seizures, neurologists need to manually monitor EEG recordings for long periods, which is challenging and susceptible to errors depending on expertise and experience. Therefore, automatic identification of seizure and seizure-free EEG signals becomes essential. This study introduces a method based on the features extracted from the phase space reconstruction for classifying seizure and seizure-free EEG signals. The computed features are derived from the elliptical area and interquartile range of the Euclidean distance by varying the percentage of data points from 50% to 100%. We consider two public datasets and evaluate these features in each EEG epoch that includes the healthy, interictal, preictal, and ictal stages of epileptic subjects, utilizing the K-nearest neighbor classifier for classification. Results show that the features have higher values during the seizure than in the seizure-free EEG signals and healthy subjects. Furthermore, the proposed features can effectively discriminate seizure EEG signals from the seizure-free and normal subjects with 100% accuracy, sensitivity, and specificity in both datasets. Likewise, the classification between the preictal stage and seizure EEG signals attains 98% accuracy. Overall, the reconstructed phase space features significantly enhance the accuracy of detecting epileptic EEG signals compared with existing methods. This advancement holds great potential in assisting neurologists in swiftly and accurately diagnosing epileptic seizures from EEG signals.
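
Phase space reconstruction and a distance-based feature can be sketched as follows. The abstract does not specify the embedding delay, the dimension, or the reference point for the Euclidean distance, so the choices below (delay 1, two dimensions, distance from the origin) are assumptions, and the elliptical-area feature and the 50% to 100% data-point selection are not reproduced.

```python
import math
from statistics import quantiles


def delay_embed(signal, delay=1, dim=2):
    """Time-delay embedding: point t is (x[t], x[t+delay], ..., x[t+(dim-1)*delay])."""
    span = (dim - 1) * delay
    return [tuple(signal[t + j * delay] for j in range(dim))
            for t in range(len(signal) - span)]


def distance_iqr(points):
    """Interquartile range of each point's Euclidean distance from the origin."""
    distances = [math.hypot(*p) for p in points]
    q1, _, q3 = quantiles(distances, n=4)
    return q3 - q1


# Hypothetical epochs: a low-amplitude and a high-amplitude oscillation.
quiet = [0.1 * math.sin(0.2 * n) for n in range(200)]
burst = [1.0 * math.sin(0.2 * n) for n in range(200)]
quiet_iqr = distance_iqr(delay_embed(quiet))
burst_iqr = distance_iqr(delay_embed(burst))
```

In this toy case the larger-amplitude epoch yields the larger spread, which is in line with the reported observation that the features take higher values during seizures; real EEG would need the datasets and parameters used in the study.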

  • Article type: Journal Article
    Autism spectrum disorder (ASD) is a neurodevelopmental disorder. ASD cannot be fully cured, but early-stage diagnosis followed by therapies and rehabilitation helps an autistic person to live a quality life. Clinical diagnosis of ASD symptoms via questionnaires and screening tests such as the Autism Spectrum Quotient-10 (AQ-10) and the Quantitative Checklist for Autism in Toddlers (Q-CHAT) is an expensive, inaccessible, and time-consuming process. Machine learning (ML) techniques are beneficial for predicting ASD easily at the initial stage of diagnosis. The main aim of this work is to classify ASD and typically developed (TD) class data using ML classifiers. In our work, we have used different ASD data sets of all age groups (toddlers, adults, children, and adolescents) to classify ASD and TD cases. We implemented one-hot encoding to translate categorical data into numerical data during preprocessing. We then used kNN Imputer with MinMaxScaler feature transformation to handle missing values and data normalization. ASD and TD class data are classified using support vector machine, k-nearest neighbor (KNN), random forest (RF), and artificial neural network classifiers. RF gives the best performance, with an accuracy of 100% under different training and testing data splits for all four types of data sets, and shows no over-fitting issue. We have also compared our results with already published work, including recent methods such as deep neural networks (DNN) and convolutional neural networks (CNN). Even compared with complex architectures such as DNN and CNN, our proposed methods provide the best results with low-complexity models. In contrast, existing methods have shown accuracy up to 98% with log-loss up to 15%. Our proposed methodology demonstrates improved generalization for real-time ASD detection during clinical trials.
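
The two preprocessing steps described above, one-hot encoding and min-max scaling, can be sketched in plain Python. The helper names and the sample values are assumptions for illustration, and the kNN-based imputation of missing values is left out of this sketch.

```python
def one_hot(values):
    """Map each categorical value to a 0/1 vector, one column per category."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]


def min_max_scale(column):
    """Rescale numbers linearly to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical screening records: a categorical answer and a numeric score.
categories, encoded = one_hot(["yes", "no", "yes"])
scaled = min_max_scale([2, 6, 10])
```

Scaling matters in particular for distance-based classifiers such as KNN, where a feature with a large numeric range would otherwise dominate the distance.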

  • Article type: Journal Article
    In this paper, a novel strategy to perform high-dimensional feature selection using an evolutionary algorithm for the automatic classification of coronary stenosis is introduced. The method involves a feature extraction stage to form a bank of 473 features of different types, such as intensity, texture, and shape. The feature selection task is carried out on this high-dimensional feature bank, where the search space is of size O(2^n) with n = 473. The proposed evolutionary search strategy was compared with different state-of-the-art methods in terms of the Jaccard coefficient and classification accuracy. The highest feature selection rate, along with the best classification performance, was obtained with a subset of four features, representing a 99% discrimination rate. In the last stage, the feature subset was used as input to train a support vector machine using an independent testing set. The classification of coronary stenosis cases is a binary classification problem with positive and negative classes. The highest classification performance was obtained with the four-feature subset in terms of accuracy (0.86) and Jaccard coefficient (0.75). In addition, a second dataset containing 2788 instances was formed from a public image database, obtaining an accuracy of 0.89 and a Jaccard coefficient of 0.80. Finally, based on the performance achieved with the four-feature subset, these features may be suitable for use in a clinical decision support system.
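
The two reported metrics and the size of the search space can be illustrated briefly. The confusion-matrix counts below are made up for the example and are not taken from the study; only n = 473 comes from the abstract.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def jaccard(tp, fp, fn):
    """Overlap between predicted and true positives; true negatives are ignored."""
    return tp / (tp + fp + fn)


# Hypothetical counts for a binary stenosis / no-stenosis test set.
acc = accuracy(tp=40, fp=10, fn=10, tn=40)
jac = jaccard(tp=40, fp=10, fn=10)

# Number of possible feature subsets for a bank of n = 473 features.
n_features = 473
n_subsets = 2 ** n_features
```

Because the Jaccard coefficient ignores true negatives, it is usually lower than accuracy on the same predictions, as in the reported pairs (0.86 vs. 0.75 and 0.89 vs. 0.80). The subset count has 143 decimal digits, which is why exhaustive search is impractical and a heuristic such as an evolutionary algorithm is used.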

  • Article type: Journal Article
    The quality of milk is tightly linked to its brand. A famous brand of milk always has good quality. Therefore, this study seeks to design a new fuzzy feature extraction method, called fuzzy improved null linear discriminant analysis (FiNLDA), to cluster the spectra of collected milk for identifying milk brands. To raise the classification accuracy, FiNLDA was applied to process the near-infrared (NIR) spectra of milk acquired by a portable near-infrared spectrometer. Principal component analysis and the Savitzky-Golay (SG) filtering algorithm were employed to reduce dimensionality and eliminate noise in this system, respectively. Thereafter, improved null linear discriminant analysis (iNLDA) and FiNLDA were applied to obtain the discriminant information of the NIR spectra. Finally, the K-nearest neighbor classifier was used to assess the performance of the identification system. The results indicated that the maximum classification accuracies of LDA, iNLDA, and FiNLDA were 74.7%, 88%, and 94.67%, respectively. Accordingly, the portable NIR spectrometer in combination with FiNLDA can classify milk brands correctly and effectively.
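
Savitzky-Golay smoothing, used above to remove noise from the spectra, can be sketched with the classic 5-point quadratic window. The window length and polynomial order used in the study are not stated, so this choice and the function name `savgol_5pt` are assumptions; edge points are simply left unchanged here.

```python
# Convolution weights of the 5-point quadratic Savitzky-Golay filter (sum = 35).
SG_WEIGHTS = (-3, 12, 17, 12, -3)


def savgol_5pt(signal):
    """Smooth interior points with a local quadratic fit; keep the two edge points."""
    smoothed = list(signal)
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG_WEIGHTS, window)) / 35
    return smoothed


# A quadratic trend passes through unchanged, while an isolated spike is damped.
parabola = [n * n for n in range(10)]
smoothed_parabola = savgol_5pt(parabola)
spike = savgol_5pt([0, 0, 0, 1, 0, 0, 0])
```

Unlike a plain moving average, the local polynomial fit preserves smooth peak shapes, which is useful when the peaks of a spectrum carry the discriminating information.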