K-nearest neighbor

  • Article type: Journal Article
    The prevalence of vision impairment is increasing at an alarming rate. The goal of the study was to create an automated method that uses optical coherence tomography (OCT) to classify retinal disorders into four categories: choroidal neovascularization, diabetic macular edema, drusen, and normal cases. This study proposed a new framework that combines machine learning and deep learning-based techniques. The classifiers used were support vector machine (SVM), K-nearest neighbor (K-NN), decision tree (DT), and ensemble model (EM). A feature extractor, the InceptionV3 convolutional neural network, was also employed. The performance of the models was evaluated against nine criteria using a dataset of 18,000 OCT images. For the SVM, K-NN, DT, and EM classifiers, the analysis showed state-of-the-art performance, with classification accuracies of 99.43%, 99.54%, 97.98%, and 99.31%, respectively. The result is a promising methodology for the automatic identification and classification of retinal disorders that reduces human error and saves time.
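
The K-NN step named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy 2-D vectors stand in for InceptionV3 features, and the function name `knn_predict`, the choice of Euclidean distance, and `k=3` are assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Euclidean distance from the query to every training vector
    dists = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Sort by distance only, so labels are never compared with each other
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D "feature vectors" (hypothetical); real CNN features have far more dimensions
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["normal", "normal", "normal", "drusen", "drusen", "drusen"]

prediction = knn_predict(train_X, train_y, (0.95, 1.0), k=3)
print(prediction)
```

K-NN has no training phase beyond storing the labeled vectors, so the quality of the extracted features largely determines the result.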

  • Article type: Journal Article
    In recent years, there has been a significant improvement in the accuracy of the classification of pigmented skin lesions using artificial intelligence algorithms. Intelligent analysis and classification systems are significantly superior to visual diagnostic methods used by dermatologists and oncologists. However, the application of such systems in clinical practice is severely limited due to a lack of generalizability and risks of potential misclassification. Successful implementation of artificial intelligence-based tools into clinicopathological practice requires a comprehensive study of the effectiveness and performance of existing models, as well as further promising areas for potential research development. The purpose of this systematic review is to investigate and evaluate the accuracy of artificial intelligence technologies for detecting malignant forms of pigmented skin lesions. For the study, 10,589 scientific research and review articles were selected from electronic scientific publishers, of which 171 articles were included in the presented systematic review. All selected scientific articles are distributed according to the proposed neural network algorithms from machine learning to multimodal intelligent architectures and are described in the corresponding sections of the manuscript. This research aims to explore automated skin cancer recognition systems, from simple machine learning algorithms to multimodal ensemble systems based on advanced encoder-decoder models, visual transformers (ViT), and generative and spiking neural networks. In addition, as a result of the analysis, future directions of research, prospects, and potential for further development of automated neural network systems for classifying pigmented skin lesions are discussed.

  • Article type: Journal Article
    (1) Background: The authenticity of eggs in relation to the housing system of laying hens is susceptible to food fraud due to the potential for egg mislabeling. (2) Methods: A total of 4188 egg yolks, obtained from four different breeds of laying hens housed in colony cage, barn, free-range, and organic systems, were analyzed using 1H NMR spectroscopy. The data of the resulting 1H NMR spectra were used for different machine learning methods to build classification models for the four housing systems. (3) Results: The comparison of the seven computed models showed that the support vector machine (SVM) model gave the best results with a cross-validation accuracy of 98.5%. The test of classification models with eggs from supermarkets showed that only a maximum of 62.8% of samples were classified according to the housing system labeled on the eggs. (4) Conclusion: The classification models developed in this study included the largest sample size compared to the literature. The SVM model is most suitable for evaluating 1H NMR data in terms of the hen housing system. The test with supermarket samples showed that more authentic samples to analyze influencing factors such as breed, feeding, and housing changes are required.

  • Article type: Journal Article
    Dynamic wireless charging (DWC) has emerged as a viable approach to mitigate range anxiety by ensuring continuous and uninterrupted charging for electric vehicles in motion. DWC systems rely on the length of the transmitter, which can be categorized into long-track transmitters and segmented coil arrays. The segmented coil array, favored for its heightened efficiency and reduced electromagnetic interference, stands out as the preferred option. However, in such DWC systems, the need arises to detect the vehicle's position, specifically to activate the transmitter coils aligned with the receiver pad and de-energize uncoupled transmitter coils. This paper introduces various machine learning algorithms for precise vehicle position determination, accommodating diverse ground clearances of electric vehicles and various speeds. Through testing eight different machine learning algorithms and comparing the results, the random forest algorithm emerged as superior, displaying the lowest error in predicting the actual position.

  • Article type: Clinical Study
    In daily life, both acoustic factors and social context can affect listening effort investment. In laboratory settings, information about listening effort has been deduced from pupil and cardiovascular responses independently. The extent to which these measures can jointly predict listening-related factors is unknown. Here we combined pupil and cardiovascular features to predict acoustic and contextual aspects of speech perception. Data were collected from 29 adults (mean = 64.6 years, SD = 9.2) with hearing loss. Participants performed a speech perception task at two individualized signal-to-noise ratios (corresponding to 50% and 80% of sentences correct) and in two social contexts (the presence and absence of two observers). Seven features were extracted per trial: baseline pupil size, peak pupil dilation, mean pupil dilation, interbeat interval, blood volume pulse amplitude, pre-ejection period and pulse arrival time. These features were used to train k-nearest neighbor classifiers to predict task demand, social context and sentence accuracy. The k-fold cross-validation on the group-level data revealed above-chance classification accuracies: task demand, 64.4%; social context, 78.3%; and sentence accuracy, 55.1%. However, classification accuracies diminished when the classifiers were trained and tested on data from different participants. Individually trained classifiers (one per participant) performed better than group-level classifiers: 71.7% (SD = 10.2) for task demand, 88.0% (SD = 7.5) for social context, and 60.0% (SD = 13.1) for sentence accuracy. We demonstrated that classifiers trained on group-level physiological data to predict aspects of speech perception generalized poorly to novel participants. Individually calibrated classifiers hold more promise for future applications.
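
The gap between pooled k-fold results and performance on unseen participants comes down to how trials are split. The sketch below shows one way to build participant-wise splits, assuming a hypothetical list `participant_ids` with one entry per trial; the abstract does not state the exact splitting procedure, so this is an illustration of the idea rather than the study's protocol.

```python
def leave_one_participant_out(participant_ids):
    """Yield (held_out, train_idx, test_idx) so no participant appears in both sets."""
    for held_out in sorted(set(participant_ids)):
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train_idx, test_idx

# Hypothetical trial-to-participant mapping: 3 participants, 2 trials each
participant_ids = ["p1", "p1", "p2", "p2", "p3", "p3"]
splits = list(leave_one_participant_out(participant_ids))

for held_out, train_idx, test_idx in splits:
    print(held_out, train_idx, test_idx)
```

A plain k-fold split over pooled trials can place trials from the same person in both the training and test folds, which tends to flatter the group-level accuracy.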

  • Article type: Journal Article
    This study introduces an innovative approach for fault diagnosis of a multistage centrifugal pump (MCP) using explanatory ratio (ER) linear discriminant analysis (LDA). Initially, the method addresses the challenge of background noise and interference in vibration signals by identifying a fault-sensitive frequency band (FSFB). From the FSFB, raw hybrid statistical features are extracted in time, frequency, and time-frequency domains, forming a comprehensive feature pool. Recognizing that not all features adequately represent MCP conditions and can reduce classification accuracy, we propose a novel ER-LDA method. ER-LDA evaluates feature importance by calculating the explanatory ratio between interclass distance and intraclass scatteredness, facilitating the selection of discriminative features through LDA. This fusion of ER-based feature assessment and LDA yields the novel ER-LDA technique. The resulting selective feature set is then passed into a k-nearest neighbor (K-NN) algorithm for condition classification, distinguishing between normal, mechanical seal hole, mechanical seal scratch, and impeller defect states of the MCP. The proposed technique surpasses current cutting-edge techniques in fault classification.
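
The abstract does not give the formula for the explanatory ratio, so the following is only a sketch under an assumption: a Fisher-style score that divides the spread of class means (interclass distance) by the average within-class variance (intraclass scatter). The function name `explanatory_ratio` and the toy feature values are hypothetical.

```python
from statistics import mean, pvariance

def explanatory_ratio(values, labels):
    """Assumed Fisher-style score: spread of class means / mean within-class variance."""
    classes = sorted(set(labels))
    groups = [[v for v, l in zip(values, labels) if l == c] for c in classes]
    overall = mean(values)
    # Interclass term: how far each class mean sits from the overall mean
    between = mean((mean(g) - overall) ** 2 for g in groups)
    # Intraclass term: average population variance inside each class
    within = mean(pvariance(g) for g in groups)
    return between / within if within > 0 else float("inf")

labels = ["normal", "normal", "normal", "fault", "fault", "fault"]
feature_a = [1.0, 1.1, 0.9, 3.0, 3.1, 2.9]  # separates the two classes well
feature_b = [1.0, 3.0, 2.0, 1.1, 2.9, 2.0]  # overlaps heavily between classes

scores = {"a": explanatory_ratio(feature_a, labels),
          "b": explanatory_ratio(feature_b, labels)}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Features with a high ratio would be kept for the classifier and those with a low ratio discarded; where the cutoff lies is a design choice the abstract does not specify.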

  • Article type: Journal Article
    This study develops a solution to sports match-fixing using various machine-learning models to detect match-fixing anomalies, based on betting odds. We use five models to distinguish between normal and abnormal matches: logistic regression (LR), random forest (RF), support vector machine (SVM), k-nearest neighbor (KNN) classification, and an ensemble model optimized from the previous four. The models classify normal and abnormal matches by learning their patterns using sports betting odds data. The database was developed based on the world football league match betting data of 12 betting companies, which offered a vast collection of data on players, teams, game schedules, and league rankings for football matches. We develop an abnormal match detection model based on the data analysis results of each model, using the match result dividend data. We then use data from real-time matches and apply the five models to construct a system capable of detecting match-fixing in real time. The RF, KNN, and ensemble models recorded a high accuracy, over 92%, whereas the LR and SVM models were approximately 80% accurate. In comparison, previous studies have used a single model to examine football match betting odds data, with an accuracy of 70-80%.
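
How the four base models are combined into the ensemble is not described in the abstract. One common option, shown here purely as an assumption, is a vote threshold: a match is flagged only when enough base models agree. The function name `ensemble_label` and the default threshold of three votes are hypothetical.

```python
def ensemble_label(model_votes, min_abnormal_votes=3):
    """Flag a match as abnormal when at least min_abnormal_votes base models say so."""
    abnormal = sum(1 for vote in model_votes.values() if vote == "abnormal")
    return "abnormal" if abnormal >= min_abnormal_votes else "normal"

# Hypothetical per-model outputs for two matches
match_1 = {"LR": "normal", "RF": "abnormal", "SVM": "abnormal", "KNN": "abnormal"}
match_2 = {"LR": "normal", "RF": "abnormal", "SVM": "normal", "KNN": "abnormal"}

print(ensemble_label(match_1), ensemble_label(match_2))
```

With four voters a 2-2 tie is possible, so the threshold makes the tie-breaking rule explicit; lowering it to two would flag more matches at the cost of more false alarms.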

  • Article type: Journal Article
    Epilepsy is a type of brain disorder triggered by an abrupt electrical imbalance of neuronal networks. An electroencephalogram (EEG) is a diagnostic tool to capture the underlying brain mechanisms and detect seizure onset in epileptic patients. To detect seizures, neurologists need to manually monitor EEG recordings for long periods, which is challenging and susceptible to errors depending on expertise and experience. Therefore, automatic identification of seizure and seizure-free EEG signals becomes essential. This study introduces a method based on the features extracted from the phase space reconstruction for classifying seizure and seizure-free EEG signals. The computed features are derived from the elliptical area and interquartile range of the Euclidean distance by varying percentage values of data points ranging from 50 to 100%. We consider two public datasets and evaluate these features in each EEG epoch that includes the healthy, interictal, preictal, and ictal stages of epileptic subjects, utilizing the K-nearest neighbor classifier for classification. Results show that the features have higher values during the seizure than the seizure-free EEG signals and healthy subjects. Furthermore, the proposed features can effectively discriminate seizure EEG signals from the seizure-free and normal subjects with 100% accuracy, sensitivity, and specificity in both datasets. Likewise, the classification between the preictal stage and seizure EEG signals attains 98% accuracy. Overall, the reconstructed phase space features significantly enhance the accuracy of detecting epileptic EEG signals compared with existing methods. This advancement holds great potential in assisting neurologists in swiftly and accurately diagnosing epileptic seizures from EEG signals.
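
To make the feature idea concrete, the sketch below builds a 2-D phase space by time-delay embedding and computes the interquartile range of each point's Euclidean distance from the origin. It is a simplified illustration: the delay of one sample, the function names, and the synthetic sine signals are assumptions, and the elliptical-area feature and the 50 to 100% point selection mentioned in the abstract are omitted because their exact definitions are not given here.

```python
import math
from statistics import quantiles

def delay_embed(signal, delay=1):
    """Pair each sample with the sample `delay` steps later (2-D phase space)."""
    return [(signal[i], signal[i + delay]) for i in range(len(signal) - delay)]

def distance_iqr(points):
    """Interquartile range of each point's Euclidean distance from the origin."""
    dists = [math.hypot(x, y) for x, y in points]
    q1, _, q3 = quantiles(dists, n=4)
    return q3 - q1

# Synthetic stand-ins for EEG epochs: same waveform, different amplitude
calm = [math.sin(0.3 * t) for t in range(200)]
burst = [5.0 * math.sin(0.3 * t) for t in range(200)]

calm_iqr = distance_iqr(delay_embed(calm))
burst_iqr = distance_iqr(delay_embed(burst))
print(calm_iqr < burst_iqr)
```

In this toy case the higher-amplitude signal spreads out further in phase space and so yields a larger feature value, which mirrors the direction of the effect the abstract reports for seizure epochs.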

  • Article type: Journal Article
    Autism spectrum disorder (ASD) is a neurodevelopmental disorder. ASD cannot be fully cured, but early-stage diagnosis followed by therapies and rehabilitation helps an autistic person to live a quality life. Clinical diagnosis of ASD symptoms via questionnaires and screening tests such as Autism Spectrum Quotient-10 (AQ-10) and Quantitative Check-list for Autism in Toddlers (Q-chat) is an expensive, inaccessible, and time-consuming process. Machine learning (ML) techniques are beneficial for predicting ASD easily at the initial stage of diagnosis. The main aim of this work is to classify ASD and typically developed (TD) class data using ML classifiers. In our work, we have used different ASD data sets of all age groups (toddlers, adults, children, and adolescents) to classify ASD and TD cases. We implemented One-Hot encoding to translate categorical data into numerical data during preprocessing. We then used kNN Imputer with MinMaxScaler feature transformation to handle missing values and data normalization. ASD and TD class data are classified using support vector machine, k-nearest-neighbor (KNN), random forest (RF), and artificial neural network classifiers. RF gives the best performance, with an accuracy of 100% across different training and testing data splits for all four types of data sets, and has no over-fitting issue. We have also compared our results with already published work, including recent methods like Deep Neural Network (DNN) and Convolution Neural Network (CNN). Compared even with complex architectures like DNN and CNN, our proposed methods provide the best results with low-complexity models. In contrast, existing methods have shown accuracy up to 98% with log-loss up to 15%. Our proposed methodology demonstrates improved generalization for real-time ASD detection during clinical trials.
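
The three preprocessing steps named above (one-hot encoding, kNN-based imputation, min-max scaling) can be sketched in plain Python. The names in the abstract suggest scikit-learn's `KNNImputer` and `MinMaxScaler`, though it does not say so; the functions below are simplified stand-ins written for illustration, and the imputer uses a single reference column rather than distances over all features. All data values are hypothetical.

```python
def one_hot(values):
    """Map each category to a 0/1 indicator vector; returns (rows, categories)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values], categories

def knn_impute(target, reference, k=2):
    """Fill None in target with the mean target of the k rows nearest in reference."""
    known = [(r, t) for r, t in zip(reference, target) if t is not None]
    filled = []
    for r, t in zip(reference, target):
        if t is None:
            nearest = sorted(known, key=lambda pair: abs(pair[0] - r))[:k]
            t = sum(v for _, v in nearest) / len(nearest)
        filled.append(t)
    return filled

def min_max_scale(column):
    """Rescale a numeric column linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

# Hypothetical screening records: age in years, a questionnaire score, an age group
age = [4, 5, 6, 30, 31]
score = [2.0, None, 4.0, 8.0, 9.0]
group = ["toddler", "toddler", "toddler", "adult", "adult"]

encoded, categories = one_hot(group)
filled = knn_impute(score, age, k=2)
scaled = min_max_scale(filled)
print(categories, filled, scaled)
```

Imputing before scaling keeps the filled-in value on the same scale as its neighbors; scaling matters for distance-based classifiers such as KNN because features with large ranges would otherwise dominate the distance.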

  • Article type: Journal Article
    In this paper, a novel strategy to perform high-dimensional feature selection using an evolutionary algorithm for the automatic classification of coronary stenosis is introduced. The method involves a feature extraction stage to form a bank of 473 features considering different types such as intensity, texture and shape. The feature selection task is carried out on a high-dimensional feature bank, where the search space is denoted by O(2^n) with n = 473. The proposed evolutionary search strategy was compared with different state-of-the-art methods in terms of the Jaccard coefficient and classification accuracy. The highest feature selection rate, along with the best classification performance, was obtained with a subset of four features, representing a 99% discrimination rate. In the last stage, the feature subset was used as input to train a support vector machine using an independent testing set. The classification of coronary stenosis cases is a binary classification problem with positive and negative classes. The highest classification performance was obtained with the four-feature subset in terms of accuracy (0.86) and Jaccard coefficient (0.75) metrics. In addition, a second dataset containing 2788 instances was formed from a public image database, obtaining an accuracy of 0.89 and a Jaccard coefficient of 0.80. Finally, based on the performance achieved with the four-feature subset, these features may be suitable for use in a clinical decision support system.
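
Two quantities in this abstract are easy to make concrete: the size of the feature-subset search space and the Jaccard coefficient. The sketch below assumes the usual binary-classification definition of the Jaccard coefficient, TP / (TP + FP + FN), which the abstract does not spell out; the label vectors are hypothetical.

```python
def jaccard(y_true, y_pred):
    """Jaccard coefficient for the positive class: TP / (TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = tp + fp + fn
    # Undefined when there are no positives at all; 0.0 is returned by convention here
    return tp / denom if denom else 0.0

# Every feature is either in or out of a subset, so there are 2^n candidate subsets
n_features = 473
n_subsets = 2 ** n_features

y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
score = jaccard(y_true, y_pred)
print(len(str(n_subsets)), score)
```

With 473 features the subset count is a 143-digit number, which is why exhaustive search is impractical and a heuristic such as an evolutionary algorithm is used instead.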