K-nearest neighbours

  • Article type: Journal Article
    Breathing is one of the body's most basic functions, and abnormal breathing can indicate underlying cardiopulmonary problems. Monitoring respiratory abnormalities can help with early detection and reduce the risk of cardiopulmonary diseases. In this study, a 77 GHz frequency-modulated continuous wave (FMCW) millimetre-wave (mmWave) radar was used to detect different types of respiratory signals from the human body in a non-contact manner for respiratory monitoring (RM). To reduce the effect of everyday environmental noise on the recognition of different breathing patterns, the system processed the breathing signals captured by the millimetre-wave radar as follows. Firstly, we filtered out most of the static noise using a signal superposition method and designed an elliptical filter to obtain a more accurate image of the breathing waveforms between 0.1 Hz and 0.5 Hz. Secondly, combined with the histogram of oriented gradient (HOG) feature extraction algorithm, K-nearest neighbours (KNN), convolutional neural network (CNN), and HOG support vector machine (G-SVM) were used to classify four breathing modes, namely, normal breathing, slow and deep breathing, quick breathing, and meningitic breathing. The overall accuracy reached up to 94.75%. Therefore, this study effectively supports daily medical monitoring.
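
As a rough illustration of the K-nearest neighbours step named in the abstract above, the following minimal sketch classifies a feature vector by majority vote among its k closest training points. The function name `knn_predict`, the two-dimensional feature vectors, and the label strings are all hypothetical; they are not the study's HOG features or data.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k nearest neighbours and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors (illustrative only, not data from the study)
train_points = [(0.20, 1.0), (0.25, 1.1), (0.10, 2.0),
                (0.12, 2.1), (0.45, 0.5), (0.50, 0.4)]
train_labels = ["normal", "normal", "slow_deep",
                "slow_deep", "quick", "quick"]

predicted = knn_predict(train_points, train_labels, (0.22, 1.05), k=3)
print(predicted)
```

With these made-up points, two of the three nearest neighbours carry the "normal" label, so the vote returns "normal". In practice the choice of k and of the distance metric would need tuning on held-out data.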






  • Article type: Journal Article
    ASD (autism spectrum disorder) is a complex developmental and neurological disorder that impacts the social life of the affected person by disturbing their capability for interaction and communication. As it is a behavioural disorder, early treatment will improve the quality of life of ASD patients. Traditional screening is carried out with behavioural assessment through trained physicians, which is expensive and time-consuming. To resolve the issue, several conventional methods strive to achieve an effective ASD identification system, but are limited in handling large data sets, accuracy, and speed. Therefore, the proposed identification system employed the MBA (modified bat) algorithm based on ANN (artificial neural networks), modified ANN (modified artificial neural networks), DT (decision tree), and KNN (k-nearest neighbours) for the classification of ASD in children and adolescents. A BA (bat algorithm) is utilised for its automatic zooming capability, which improves the system's efficacy in finding solutions in the identification system. Although BA is effective in identification, it still has certain drawbacks, such as limited speed and accuracy and a tendency to fall into local extrema. Therefore, the proposed identification system modifies the BA optimisation with random perturbation of trends and optimal orientation. The dataset utilised in the respective model is the Q-chat-10 dataset. This dataset contains data for four age groups: toddlers, children, adolescents, and adults. To analyse the quality of the dataset, dataset evaluation mechanisms, such as the chi-squared statistic and p-value, are used in the respective research. The evaluation signifies the relation of the dataset with respect to the proposed model. Further, the performance of the proposed detection system is examined with certain performance metrics to calculate its efficiency. The outcome revealed that the modified ANN classifier model attained an accuracy of 1.00, ensuring improved performance when compared with other state-of-the-art methods. Thus, the proposed model was intended to assist physicians and researchers in enhancing the diagnosis of ASD to improve the standard of life of ASD patients.






  • Article type: Journal Article
    Activities practiced in the hospital generate several types of risks. Therefore, performing risk assessment is one of the keys to quality improvement in the healthcare sector. For this reason, healthcare managers need to design and perform efficient risk assessment processes. Failure modes and effects analysis (FMEA) is one of the most used risk assessment methods. FMEA is a proactive technique consisting of the evaluation of failure modes associated with a studied process using three factors: occurrence, non-detection, and severity, in order to obtain the risk priority number using a fuzzy logic approach and machine learning algorithms, namely the support vector machine and the k-nearest neighbours. The proposed model is applied in the case of the central sterilization unit of a tertiary national reference centre of dental treatment, where its efficiency is evaluated compared to the classical approach. These comparisons are based on expert advice and machine learning performance metrics. Our model proved highly effective according to the results of the expert's vote (she agreed with 96% of the fuzzy-FMEA results against 6% of the classical FMEA results). Furthermore, the machine learning metrics show a high level of accuracy on both training data (best rate is 96%) and testing data (90%). This is the first study that aims to apply an artificial intelligence approach to risk management in the Moroccan healthcare sector. The perspective of this study is to promote the application of artificial intelligence in Moroccan health management, especially in the field of quality and safety management.
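
For orientation, the classical FMEA baseline that the abstract above compares against scores each failure mode as the product of its three factor ratings. The sketch below shows only that classical calculation, not the fuzzy-logic model of the study. The 1 to 10 rating scale, the function name `risk_priority_number`, and the failure modes with their ratings are assumptions made for illustration.

```python
def risk_priority_number(occurrence, non_detection, severity):
    """Classical FMEA score: the product of the three factor ratings."""
    for rating in (occurrence, non_detection, severity):
        # Assumption: each factor is rated on a 1-10 scale
        if not 1 <= rating <= 10:
            raise ValueError("each rating is assumed to lie on a 1-10 scale")
    return occurrence * non_detection * severity


# Hypothetical failure modes with (occurrence, non-detection, severity) ratings
failure_modes = {
    "incomplete drying": (4, 3, 6),
    "torn packaging": (2, 5, 8),
    "wrong cycle selected": (3, 2, 9),
}

# Rank failure modes from highest to lowest risk priority number
ranked = sorted(
    failure_modes,
    key=lambda name: risk_priority_number(*failure_modes[name]),
    reverse=True,
)
print(ranked)
```

Because the product treats the three factors as interchangeable, quite different rating triples can yield the same score, which is one common motivation for fuzzy variants.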






  • Article type: Journal Article
    Automatic diagnosis of attention-deficit/hyperactivity disorder (ADHD) without manual intervention is an essential yet challenging task. The present study emphasises utilizing structural MRI and personal characteristic (PC) data for developing an automated diagnostic system for ADHD classification. Here, an age-balanced dataset of 316 ADHD and 316 Typically Developing Children (TDC) was prepared from a publicly available dataset. We extracted volumetric features from gray matter (GM) volumes from brain regions defined by the Automated Anatomical Labelling (AAL3) atlas and cortical thickness-based (CT) features using the Destrieux atlas. A set of salient features was selected independently using minimum redundancy and maximum relevance (mRMR) and ensemble feature selection (EFS) methods. Decision models were trained using five well-known classifiers: K-nearest neighbours, logistic regression, linear Support Vector Machine (SVM), radial-based SVM (RBSVM), and Random Forest. The performance of the proposed system was evaluated using accuracy, recall, and specificity with ten runs of a ten-fold cross-validation scheme. We ran seven experiments by considering different combinations of features. The maximum classification accuracy of 75% was obtained with CT and PC features with RBSVM and SVM with the EFS. An increase in GM volume in fifteen brain regions and loss of cortical thickness in twenty-seven brain regions were observed.
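
The three evaluation metrics named in the abstract above can be computed from the counts of a binary confusion matrix. The sketch below is a minimal illustration; the function name `binary_metrics` and the label vectors are hypothetical, and it assumes both classes occur in the true labels so that no denominator is zero.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall (sensitivity), and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Assumption: both classes are present, so the denominators are non-zero
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels: 1 = ADHD, 0 = typically developing (illustrative only)
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a repeated ten-fold scheme such as the one described, these values would be computed per fold and then averaged across folds and runs.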






  • Article type: Journal Article
    Essential oils are valuable in various industries, but their easy adulteration can cause adverse health effects. Electronic nasal sensors offer a solution for adulteration detection. This article proposes a new system for characterising essential oils based on low-cost sensor networks and machine learning techniques. The sensors used belong to the MQ family (MQ-2, MQ-3, MQ-4, MQ-5, MQ-6, MQ-7, and MQ-8). Six essential oils were used, including Cistus ladanifer, Pinus pinaster, and Cistus ladanifer oil adulterated with Pinus pinaster, Melaleuca alternifolia, tea tree, and red fruits. A total of up to 7100 measurements were included, with more than 118 h of measurements of 33 different parameters. These data were used to train and compare five machine learning algorithms: discriminant analysis, support vector machine, k-nearest neighbours, neural network, and naive Bayesian when the data were used individually or when hourly mean values were included. To evaluate the performance of the included machine learning algorithms, accuracy, precision, recall, and F1-score were considered. The study found that using k-nearest neighbours, accuracy, recall, F1-score, and precision values were 1, 0.99, 0.99, and 1, respectively. The accuracy reached 100% with k-nearest neighbours using only 2 parameters for averaged data or 15 parameters for individual data.






  • Article type: Journal Article
    This paper proposes an efficient and fast method to create large datasets for machine learning algorithms applied to brain stroke classification via microwave imaging systems. The proposed method is based on the distorted Born approximation and linearization of the scattering operator, in order to minimize the time to generate the large datasets needed to train the machine learning algorithms. The method is then applied to a microwave imaging system, which consists of twenty-four antennas conformal to the upper part of the head, realized with a 3D anthropomorphic multi-tissue model. Each antenna acts as a transmitter and receiver, and the working frequency is 1 GHz. The data are processed with three machine learning algorithms (support vector machine, multilayer perceptron, and k-nearest neighbours), and their performance is compared. All classifiers can identify the presence or absence of the stroke, the kind of stroke (haemorrhagic or ischemic), and its position within the brain. The trained algorithms were tested with datasets generated via full-wave simulations of the overall system, considering also slightly modified antennas and limiting the data acquisition to amplitude only. The obtained results are promising for a possible real-time brain stroke classification.






  • Article type: Journal Article
    Electroencephalography is one of the most commonly used methods for extracting information about the brain's condition and can be used for diagnosing epilepsy. The EEG signal's wave shape contains vital information about the brain's state, which can be challenging for a human observer to analyse and interpret. Moreover, the characteristic waveforms of epilepsy (sharp waves, spikes) can occur randomly through time. For these reasons, automatic EEG signal extraction and analysis using computers can significantly impact the successful diagnosis of epilepsy. This research explores the impact of different window sizes on EEG signals' classification accuracy using four machine learning classifiers. The machine learning methods included a neural network with ten hidden nodes trained using three different training algorithms and the k-nearest neighbours classifier. The neural network training methods included the Broyden-Fletcher-Goldfarb-Shanno algorithm, the multistart method for global optimization problems, and a genetic algorithm. The current research utilized the University of Bonn dataset containing EEG data, divided into epochs having 50% overlap and window lengths ranging from 1 to 24 s. Then, statistical and spectral features were extracted and used to train the above four classifiers. The outcome of the above experiments showed that large window sizes with a length of about 21 s could positively impact the classification accuracy of the compared methods.
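
The epoching step described in the abstract above (fixed-length windows with 50% overlap) can be sketched as follows. The function name `split_into_epochs`, the 1 Hz sampling rate, and the toy signal are assumptions chosen to keep the arithmetic easy to follow; they are not the parameters of the Bonn recordings.

```python
def split_into_epochs(signal, sampling_rate_hz, window_s, overlap=0.5):
    """Cut a 1-D signal into fixed-length windows that overlap by a given fraction."""
    window = int(round(window_s * sampling_rate_hz))      # samples per epoch
    step = max(1, int(round(window * (1 - overlap))))     # hop between epoch starts
    # Only full-length epochs are kept; a trailing partial window is dropped
    return [
        signal[start:start + window]
        for start in range(0, len(signal) - window + 1, step)
    ]


# Toy signal: 10 samples at a hypothetical 1 Hz, cut into 4 s windows
signal = list(range(10))
epochs = split_into_epochs(signal, sampling_rate_hz=1, window_s=4)
print(epochs)
```

Each epoch shares half of its samples with the previous one, so longer windows give fewer but more informative epochs, which is the trade-off the study explores.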






  • Article type: Journal Article
    Seasonal variations (SVs) affect the population density (PD), fate, and fitness of pathogens in environmental water resources and the public health impacts. Therefore, this study is aimed at applying machine learning intelligence (MLI) to predict the impacts of SVs on P. shigelloides population density (PDP) in the aquatic milieu. Physicochemical events (PEs) and PDP from three rivers acquired via standard microbiological and instrumental techniques across seasons were fitted to MLI algorithms (linear regression (LR), multiple linear regression (MR), random forest (RF), gradient boosted machine (GBM), neural network (NN), K-nearest neighbour (KNN), boosted regression tree (BRT), extreme gradient boosting (XGB) regression, support vector regression (SVR), decision tree regression (DTR), M5 pruned regression (M5P), artificial neural network (ANN) regression (with one 10-node hidden layer (ANN10), two 6- and 4-node hidden layers (ANN64), and two 5- and 5-node hidden layers (ANN55)), and elastic net regression (ENR)) to assess the implications of the SVs of PEs on aquatic PDP. The results showed that SVs significantly influenced PDP and PEs in the water (p < 0.0001), exhibiting a site-specific pattern. While MLI algorithms predicted PDP with differing absolute flux magnitudes for the contributing variables, DTR predicted the highest PDP value of 1.707 log unit, followed by XGB (1.637 log unit), but XGB (mean-squared-error (MSE) = 0.0025; root-mean-squared-error (RMSE) = 0.0501; R² = 0.998; medium absolute deviation (MAD) = 0.0275) outperformed other models in terms of regression metrics. Temperature and total suspended solids (TSS) ranked first and second as significant factors in predicting PDP in 53.3% (8/15) and 40% (6/15), respectively, of the models, based on the RMSE loss after permutations. Additionally, season ranked third among the 7 models, and turbidity (TBS) ranked fourth at 26.7% (4/15), as the primary significant factor for predicting PDP in the aquatic milieu. The results of this investigation demonstrated that MLI predictive modelling techniques can promisingly be exploited to complement the repetitive laboratory-based monitoring of PDP and other pathogens, especially in low-resource settings, in response to seasonal fluxes and can provide insights into the potential public health risks of emerging pathogens and TSS pollution (e.g., nanoparticles and micro- and nanoplastics) in the aquatic milieu. The model outputs provide low-cost and effective early warning information to assist watershed managers and fish farmers in making appropriate decisions about water resource protection, aquaculture management, and sustainable public health protection.
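
Among the regressors listed in the abstract above, K-nearest neighbour regression is the simplest to sketch: the prediction is the mean target of the k closest training points. The example below pairs it with the RMSE metric the abstract reports. The function names, the (temperature, TSS) feature pairs, and the target values are hypothetical and unscaled; they are not the study's data.

```python
import math


def knn_regress(train_x, train_y, query, k=3):
    """Predict the mean target of the k training points nearest to query."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    return sum(target for _, target in nearest) / len(nearest)


def rmse(actual, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical (temperature, TSS) features and log-unit targets
train_x = [(20.0, 5.0), (22.0, 6.0), (28.0, 12.0), (30.0, 15.0)]
train_y = [1.0, 1.2, 1.6, 1.8]

prediction = knn_regress(train_x, train_y, (21.0, 5.5), k=2)
print(prediction, rmse([1.0, 2.0], [1.0, 4.0]))
```

Real features would normally be standardised first, since Euclidean distance is otherwise dominated by whichever variable has the largest numeric range.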






  • Article type: Journal Article
    Forecasting municipal solid waste (MSW) generation and composition plays an essential role in effective waste management, policy decision-making and the MSW treatment process. An intelligent forecasting system could be used for short-term and long-term waste handling, ensuring a circular economy and a sustainable use of resources. This study contributes to the field by proposing a hybrid k-nearest neighbours (H-kNN) approach to forecasting municipal solid waste and its composition in regions that experience data incompleteness and inaccessibility, as is the case for Lithuania and many other countries. For this purpose, the average MSW generation of neighbouring municipalities, as a geographical factor, was used to impute missing values, and socioeconomic factors together with demographic indicators affecting waste collected in municipalities were identified and quantified using correlation analysis. Among them, the most influential factors, such as population density, GDP per capita, private property, foreign investment per capita, and tourism, were then incorporated in the hierarchical setting of the H-kNN approach. The results showed that, in forecasting MSW generation, H-kNN achieved a MAPE of 11.05%, on average, across all Lithuanian municipalities, which is 7.17 percentage points lower than that obtained using kNN. This implies that by finding relevant factors at the municipal level, we can compensate for the data incompleteness and enhance the forecasting results of MSW generation and composition.
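
The error measure quoted in the abstract above, the mean absolute percentage error (MAPE), averages the absolute forecast error relative to each observed value. A minimal sketch follows; the function name `mape` and the sample figures are hypothetical, and it assumes no observed value is zero.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    # Assumption: every actual value is non-zero, otherwise the ratio is undefined
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical waste totals for three municipalities (illustrative only)
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

error_percent = mape(actual, forecast)
print(round(error_percent, 2))
```

Here the individual errors are 10%, 10%, and 0%, giving a MAPE of about 6.67%. Comparing two models by a difference in MAPE, as the abstract does, is a difference in percentage points rather than a relative change.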






  • Article type: Journal Article
    This paper presents a learning system with a K-nearest neighbour classifier to classify the wear condition of a multi-piston positive displacement pump. The first part reviews current diagnostic methods and describes typical failures of multi-piston positive displacement pumps and their causes. Next is a description of a diagnostic experiment conducted to acquire a matrix of vibration signals from selected locations in the pump body. The measured signals were subjected to time-frequency analysis. The signal features calculated in the time and frequency domains were grouped in a table according to the wear condition of the pump. The next step was to create classification models of the pump wear condition and assess their accuracy. The selected model, which best met the set criteria for accuracy assessment, was verified with new measurement data. The article ends with a summary.





