K-nearest neighbor

  • Article type: Journal Article
    The detection performance of radar is significantly impaired by active jamming and by mutual interference from other radars. This paper proposes a radio signal modulation recognition method that accurately recognizes these signals and thereby supports jamming cancellation decisions. Based on an ensemble learning stacking algorithm improved by meta-feature enhancement, the proposed method adopts random forests, K-nearest neighbors, and Gaussian naive Bayes as the base-learners, with logistic regression serving as the meta-learner. It takes multi-domain features of the signals as input: time-domain features, including fuzzy entropy, slope entropy, and Hjorth parameters; frequency-domain features, including spectral entropy; and fractal-domain features, including fractal dimension. A simulation experiment covering seven common signal types of radar and active jamming was performed to validate effectiveness and evaluate performance. The results show that the proposed method outperforms other classification methods and can meet the requirements of low signal-to-noise ratio and few-shot learning.

  • Article type: Journal Article
    Ultrasound information entropy is a flexible approach for analyzing ultrasound backscattering. Shannon entropy imaging based on probability distribution histograms (PDHs) has been implemented as a promising method for tissue characterization and diagnosis. However, the number of bins affects the stability of entropy estimation. In this study, we introduced the k-nearest neighbor (KNN) algorithm to estimate entropy values and proposed ultrasound KNN entropy imaging. The proposed KNN estimator uses the Euclidean distance between data samples rather than the histogram bins used by conventional PDH estimators. We also proposed cumulative relative entropy (CRE) imaging to analyze time-series radiofrequency signals and applied it to monitor thermal lesions induced by microwave ablation (MWA). Computer simulation phantom experiments were conducted to validate and compare the performance of the proposed KNN entropy imaging, conventional PDH entropy imaging, and Nakagami-m parametric imaging in detecting variations in scatterer density and visualizing inclusions. Clinical data of breast lesions were analyzed, and ex vivo porcine liver MWA experiments were conducted, to validate the performance of KNN entropy imaging in classifying benign and malignant breast tumors and in monitoring thermal lesions, respectively. Compared with PDH, entropy estimation based on KNN was less affected by the tuning parameters. KNN entropy imaging was more sensitive to changes in scatterer density and provided better visualization than typical Shannon entropy (TSE) and Nakagami-m parametric imaging. Among the imaging methods compared, KNN-based Shannon entropy (KSE) imaging achieved higher accuracy in classifying benign and malignant breast tumors, and KNN-based CRE imaging had larger lesion-to-normal contrast when monitoring the ablated areas during MWA at different powers and treatment durations. Ultrasound KNN entropy imaging is a potential quantitative ultrasound approach for tissue characterization.
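
    The idea of estimating entropy from nearest-neighbor distances instead of histogram bins can be illustrated with the classical Kozachenko-Leonenko estimator. The sketch below is an assumption about the general family of estimator, not the paper's implementation: it handles 1-D samples only, and the names `digamma_int` and `knn_entropy_1d`, the choice `k=3`, and the random test data are all hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n: int) -> float:
    """Digamma function at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_entropy_1d(samples, k=3):
    """Kozachenko-Leonenko estimate of differential entropy (in nats), 1-D samples.

    H ~= psi(N) - psi(k) + log(2) + (1/N) * sum_i log(r_i),
    where r_i is the distance from sample i to its k-th nearest neighbor and
    2 is the volume of the 1-D unit ball. Assumes no duplicate samples (r_i > 0).
    """
    n = len(samples)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(samples)")
    log_r_sum = 0.0
    for i, x in enumerate(samples):
        # Brute-force distances to every other sample; fine for a small sketch.
        dists = sorted(abs(x - y) for j, y in enumerate(samples) if j != i)
        log_r_sum += math.log(dists[k - 1])
    return digamma_int(n) - digamma_int(k) + math.log(2.0) + log_r_sum / n


# Hypothetical data: uniform samples on [0, 1], whose true entropy is 0 nats.
random.seed(0)
data = [random.random() for _ in range(500)]
h_unit = knn_entropy_1d(data, k=3)
# Scaling the data by 2 should shift the estimate by log(2).
h_scaled = knn_entropy_1d([2.0 * x for x in data], k=3)
```

    The only tuning parameter is `k`; no bin width or bin count has to be chosen, which is the property the abstract contrasts with PDH-based estimation.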

  • Article type: Journal Article
    The quality of milk is closely linked to its brand, and milk from a well-known brand is generally of good quality. This study therefore designs a new fuzzy feature extraction method, called fuzzy improved null linear discriminant analysis (FiNLDA), to cluster the spectra of collected milk samples and identify milk brands. To raise the classification accuracy, FiNLDA was applied to the near-infrared (NIR) spectra of milk acquired with a portable NIR spectrometer. Principal component analysis and the Savitzky-Golay (SG) filtering algorithm were employed to reduce dimensionality and eliminate noise, respectively. Improved null linear discriminant analysis (iNLDA) and FiNLDA were then applied to extract the discriminant information of the NIR spectra. Finally, the K-nearest neighbor classifier was used to assess the performance of the identification system. The results indicated that the maximum classification accuracies of LDA, iNLDA, and FiNLDA were 74.7%, 88%, and 94.67%, respectively. Accordingly, the portable NIR spectrometer in combination with FiNLDA can classify milk brands correctly and effectively.
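
    The last stage of such a pipeline, a K-nearest neighbor classifier applied to the extracted discriminant features, can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean distance, unweighted majority voting, and made-up 2-D feature vectors with brand labels "A" and "B"; the function name `knn_predict` is hypothetical and nothing here reproduces the study's data or its FiNLDA step.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points closest to `query`."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical discriminant features for two milk brands.
train_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 1.1), (0.9, 1.0), (1.2, 0.8)]
train_y = ["A", "A", "A", "B", "B", "B"]

label_near_a = knn_predict(train_x, train_y, (0.1, 0.1), k=3)
label_near_b = knn_predict(train_x, train_y, (1.0, 0.9), k=3)
```

    With two classes, an odd `k` avoids tied votes.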

  • Article type: Journal Article
    In machine learning and data analysis, dimensionality reduction and high-dimensional data visualization can be accomplished by manifold learning using the t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm. We significantly improve this manifold learning scheme by introducing a preprocessing strategy for the t-SNE algorithm. In our preprocessing, we first use Laplacian eigenmaps to reduce the high-dimensional data, which aggregates each data cluster and markedly reduces the Kullback-Leibler divergence (KLD). The k-nearest-neighbor (KNN) algorithm is also used in the preprocessing to enhance visualization performance and reduce computational and space complexity. We compare the performance of our strategy with that of standard t-SNE on the MNIST dataset. The experimental results show that our strategy separates different clusters more strongly and keeps data of the same kind much closer together. Moreover, the KLD can be reduced by about 30% at the cost of increasing the runtime by only 1-2%.
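
    For reference, the Kullback-Leibler divergence reported above is the standard t-SNE cost between the pairwise similarities p_ij of the high-dimensional data and the similarities q_ij of the low-dimensional embedding points y_i. The formula below is the usual textbook form, given here as background and not taken from the paper's own notation:

```latex
\mathrm{KL}(P \,\|\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}},
\qquad
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}
```

    A lower value means the embedding's neighbor structure matches the original data more closely, which is why a 30% reduction is presented as an improvement.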

  • Article type: Journal Article
    Alzheimer's disease (AD) is a progressive and irreversible neurodegenerative brain disorder with significant economic and societal impacts, and early AD diagnosis remains a considerable challenge. Here, a robust and convenient surface-enhanced Raman scattering (SERS) analysis platform was fabricated on a microarray chip to dissect the variation in serum composition for AD diagnosis, avoiding diagnostic methods that rely on invasive cerebrospinal fluid (CSF) sampling or costly instruments. An AuNOs array prepared by self-assembly at a liquid-liquid interface enabled the acquisition of SERS spectra with excellent reproducibility. Moreover, a finite-difference time-domain (FDTD) simulation suggested that AuNOs aggregation generates significant plasmon hybridization, resulting in SERS spectra with a high signal-to-noise ratio. We established an AD mouse model by Aβ1-40 induction and then recorded the serum SERS spectra at different stages. A multivariate analysis method, principal component analysis (PCA)-weighted representation-based k-nearest neighbor (WRKNN), was applied for feature extraction to improve the classification performance, giving an accuracy of over 95%, an AUC of over 90%, a sensitivity of over 80%, and a specificity of over 96.7%. The results of this study demonstrate the potential of SERS as a diagnostic screening method which, following further validation and optimization, may open up exciting new opportunities for future biomedical applications.

  • Article type: Journal Article
    Three-dimensional (3D) bioprinting is a computer-controlled technology that combines biological factors and bioinks to print an accurate 3D structure in a layer-by-layer fashion. It is a new tissue engineering technology based on rapid prototyping and additive manufacturing, combined with various disciplines. In addition to the problems in the in vitro culture process, the bioprinting procedure is afflicted with a few issues: (1) difficulty in finding an appropriate bioink that matches the printing parameters so as to reduce cell damage and mortality; and (2) difficulty in improving printing accuracy during the printing process. Data-driven machine learning algorithms with powerful predictive capabilities have natural advantages in behavior prediction and new model exploration. Combining machine learning algorithms with 3D bioprinting helps to find more efficient bioinks, determine printing parameters, and detect defects in the printing process. This paper introduces several machine learning algorithms in detail, summarizes the role of machine learning in additive manufacturing applications, and reviews recent research progress in combining 3D bioprinting with machine learning, especially the improvement of bioink generation, the optimization of printing parameters, and the detection of printing defects.

  • Article type: Journal Article
    The graph autoencoder (GAE) is a powerful tool for unsupervised representation learning on graph data. However, most existing GAE-based methods focus on preserving the topological structure of the graph by reconstructing the adjacency matrix while ignoring the preservation of node attribute information. As a result, the node attributes cannot be fully learned, and the ability of the GAE to learn higher-quality representations is weakened. To address this issue, this paper proposes a novel GAE model that preserves node attribute similarity. The structural graph and the attribute neighbor graph, which is constructed from the attribute similarity between nodes, are integrated as the encoder input using an effective fusion strategy. In the encoder, the attributes of the nodes can be aggregated both in their structural neighborhood and, by attribute similarity, in their attribute neighborhood. This allows the structural and node attribute information to be fused in the node representation by sharing the same encoder. In the decoder module, the adjacency matrix and the attribute similarity matrix of the nodes are reconstructed using dual decoders. The cross-entropy loss of the reconstructed adjacency matrix and the mean-squared error loss of the reconstructed node attribute similarity matrix are used to update the model parameters and to ensure that the node representation preserves the original structural and node attribute similarity information. Extensive experiments on three citation networks show that the proposed method outperforms state-of-the-art algorithms in link prediction and node clustering tasks.

  • Article type: Journal Article
    Accurately forecasting the demand for urban online car-hailing is of great significance for improving operating efficiency and for reducing traffic congestion and energy consumption. This paper takes 265 days of order data from the Hefei urban online car-hailing platform, collected from 2019 to 2021, as an example and divides each day into 48 time units (30 min per unit) to form a data set. Taking the minimum mean absolute error as the optimization objective, the historical data sets are classified, and the values of the state vector T and the parameter K of the K-nearest neighbor model are optimized, which addresses the prediction error caused by fixed values of T or K in the traditional model. The results show that the forecasting accuracy of the K-nearest neighbor model can reach 93.62%, which is much higher than that of the exponential smoothing model (81.65%) and the KNN1 model (84.02%) and is similar to that of the LSTM model (91.04%), meaning that it can adapt to the urban online car-hailing system and has potential application value.
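
    A generic K-nearest neighbor time-series forecaster with a tunable state-vector length T and neighbor count K can be sketched as below. This is an illustrative reading of the abstract, not the paper's model: Euclidean distance, unweighted averaging of successors, the grid search in `select_T_K`, and the toy periodic series are all assumptions, and the paper's classification of historical data sets is not reproduced.

```python
import math


def knn_forecast(history, T, K):
    """Predict the next value of `history` from its K most similar past windows.

    The current state is the last T observations. Every earlier length-T window
    that has a known successor is a candidate; the forecast is the mean of the
    successors of the K candidates closest in Euclidean distance.
    """
    if len(history) < T + K:
        raise ValueError("history too short for the chosen T and K")
    state = history[-T:]
    candidates = []
    for start in range(len(history) - T):
        window = history[start:start + T]
        successor = history[start + T]
        candidates.append((math.dist(window, state), successor))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:K]
    return sum(s for _, s in nearest) / len(nearest)


def select_T_K(history, T_values, K_values, n_test):
    """Pick (T, K) minimizing mean absolute error on the last n_test points."""
    best = None
    for T in T_values:
        for K in K_values:
            errors = [abs(knn_forecast(history[:t], T, K) - history[t])
                      for t in range(len(history) - n_test, len(history))]
            mae = sum(errors) / n_test
            if best is None or mae < best[0]:
                best = (mae, T, K)
    return best


# Hypothetical demand series with a period of four time units.
orders = [10, 20, 30, 40] * 5
next_value = knn_forecast(orders, T=2, K=3)
best_mae, best_T, best_K = select_T_K(orders, [1, 2], [1, 2, 3], n_test=4)
```

    Searching over T and K on held-out data, instead of fixing them in advance, is the step the abstract credits for the reduced prediction error.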

  • Article type: Journal Article
    All countries encounter air pollution problems along their development path. As a significant indicator of air quality, PM2.5 concentration has long been shown to affect population mortality. Machine learning algorithms, which have been shown to outperform traditional statistical approaches, are widely used in air pollution prediction. However, research on model selection and on the environmental interpretation of model predictions is still scarce and is urgently needed to guide policy making on air pollution control. Our research compared four machine learning algorithms (LinearSVR, K-Nearest Neighbor, Lasso regression, and Gradient Boosting) by examining their performance in predicting PM2.5 concentrations across different cities and seasons. The results show that the machine learning models are able to forecast the next day's PM2.5 concentration from the previous five days' data with good accuracy. The comparative experiments show that, at the city level, the Gradient Boosting model has the best prediction performance, with a mean absolute error (MAE) of 9 µg/m³ and a root mean square error (RMSE) of 10.25-16.76 µg/m³, both lower than those of the other three models; at the season level, all four models perform best in winter and worst in summer. More importantly, the differences in model performance across cities and seasons have significant implications for environmental policy.
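
    The setup described above, predicting the next day from the previous five days and scoring with MAE and RMSE, can be made concrete with a short sketch. The helper names (`make_lagged`, `mae`, `rmse`), the PM2.5 values, and the naive "repeat yesterday" baseline are hypothetical and stand in for the study's models and data.

```python
import math


def make_lagged(series, n_lags=5):
    """Turn a daily series into (previous n_lags values, next-day value) pairs."""
    return [(series[i - n_lags:i], series[i]) for i in range(n_lags, len(series))]


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical daily PM2.5 concentrations in ug/m3.
pm25 = [35, 40, 38, 50, 45, 42, 60]
pairs = make_lagged(pm25, n_lags=5)

# Naive persistence baseline: predict that tomorrow equals the last observed day.
actual = [target for _, target in pairs]
predicted = [lags[-1] for lags, _ in pairs]
baseline_mae = mae(actual, predicted)
baseline_rmse = rmse(actual, predicted)
```

    RMSE penalizes large misses more heavily than MAE, so reporting both, as the abstract does, shows whether a model's errors are evenly spread or dominated by a few bad days.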

  • Article type: Journal Article
    Diagnosis of pediatric bipolar disorder (PBD) based on clinical assessment may sometimes lead to misdiagnosis in clinical practice. In the past several years, machine learning (ML) methods have been introduced for the classification of bipolar disorder (BD) and have proved helpful in its diagnosis. In this study, brain cortical thickness and subcortical volume of 33 PBD-I patients and 19 age- and sex-matched healthy controls (HCs) were extracted from magnetic resonance imaging (MRI) data and used as features for classification. The dimensionality-reduced feature subset, filtered by Lasso or f_classif, was passed to six classifiers (logistic regression (LR), support vector machine (SVM), random forest classifier, naïve Bayes, k-nearest neighbor, and AdaBoost), which were then trained and tested. Among all the classifiers, the two with the highest accuracy were LR (84.19%) and SVM (82.80%). Feature selection was performed in the six algorithms to obtain the most important variables, including the right middle temporal gyrus and bilateral pallidum, which is consistent with structural and functional anomalous changes in these brain regions in PBD patients. These findings take the computer-aided diagnosis of BD a step forward.