Classification Algorithms

  • Article type: Journal Article
    BACKGROUND: The study aimed to determine the most crucial parameters associated with cardiovascular disease (CVD) and to employ a novel data ensemble refinement procedure to uncover the optimal pattern of these parameters that yields high prediction accuracy.
    RESULTS: Data were collected from 369 patients in total: 281 patients with CVD or at risk of developing it, and 88 otherwise healthy individuals. Within the group of 281 CVD or at-risk patients, 53 were diagnosed with coronary artery disease (CAD), 16 with end-stage renal disease, 47 were newly diagnosed with type 2 diabetes mellitus, and 92 had chronic inflammatory disorders (21 rheumatoid arthritis, 41 psoriasis, 30 angiitis). The data were analyzed using an artificial intelligence-based algorithm with the primary objective of identifying the optimal pattern of parameters that define CVD. The study highlights the effectiveness of a six-parameter combination in discerning the likelihood of cardiovascular disease using the DERGA and Extra Trees algorithms. These parameters, ranked in order of importance, are platelet-derived microvesicles (PMVs), hypertension, age, smoking, dyslipidemia, and body mass index (BMI). Endothelial and erythrocyte MVs, along with diabetes, were the least important predictors. The highest prediction accuracy achieved is 98.64%. Notably, using PMVs alone yields 91.32% accuracy, while the optimal model employing all ten parameters yields a prediction accuracy of 0.9783 (97.83%).
    CONCLUSIONS: Our research showcases the efficacy of DERGA, an innovative data ensemble refinement greedy algorithm. DERGA accelerates the assessment of an individual's risk of developing CVD, allows for early diagnosis, significantly reduces the number of required lab tests, and optimizes resource utilization. Additionally, it assists in identifying the optimal parameters critical for assessing CVD susceptibility, thereby enhancing our understanding of the underlying mechanisms.
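The abstract above describes a greedy search for a small parameter subset that still predicts well. The sketch below illustrates generic greedy forward feature selection only; it is not the DERGA procedure, whose details the abstract does not give. A nearest-centroid classifier stands in for Extra Trees to keep the example dependency-free, and the function names and toy data are hypothetical.

```python
from statistics import mean


def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on the given feature indices."""
    correct = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        # Build one centroid per class from every sample except the held-out one.
        centroids = {}
        for cls in sorted(set(labels)):
            members = [r for j, r in enumerate(rows) if labels[j] == cls and j != i]
            centroids[cls] = [mean(m[f] for m in members) for f in features]

        def squared_distance(cls):
            return sum((row[f] - c) ** 2 for f, c in zip(features, centroids[cls]))

        correct += min(centroids, key=squared_distance) == label
    return correct / len(rows)


def greedy_forward_selection(rows, labels):
    """Add the single best feature each round; stop when accuracy no longer improves."""
    remaining = list(range(len(rows[0])))
    selected, best_score = [], 0.0
    while remaining:
        scored = [(loo_accuracy(rows, labels, selected + [f]), f) for f in remaining]
        # Highest score wins; ties go to the lowest feature index.
        score, feature = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Invented toy data: feature 0 separates the classes, features 1 and 2 are noise.
rows = [
    (1.0, 5.0, 0.2), (1.2, 1.0, 0.1), (0.9, 3.0, 0.3), (1.1, 4.0, 0.9),
    (3.0, 4.5, 0.8), (3.2, 1.5, 0.9), (2.9, 2.0, 0.7), (3.1, 5.0, 0.2),
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

selected, score = greedy_forward_selection(rows, labels)
print(selected, score)
```

On this toy data the search stops after one feature, loosely mirroring the abstract's observation that a single strong marker can carry most of the predictive power.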

  • Article type: Journal Article
    The emergence and rapid spread of multidrug-resistant bacteria (MRB), caused by the excessive use of antibiotics and the development of biofilms, are a growing threat to global public health. Nanoparticles used as substitutes for antibiotics have been shown to possess substantial abilities for tackling MRB infections via new antimicrobial mechanisms. In particular, carbon dots (CDs) with unique (bio)physicochemical characteristics have received considerable attention for combating MRB by damaging the bacterial wall, binding to DNA or enzymes, inducing hyperthermia locally, or forming reactive oxygen species.
    Herein, how the physicochemical features of various CDs affect their antimicrobial capacity is investigated with the assistance of machine learning (ML) tools.
    The synthetic conditions and intrinsic properties of CDs from 121 samples are first gathered to form the raw dataset, with the minimum inhibitory concentration (MIC) as the output. Four classification algorithms (KNN, SVM, RF, and XGBoost) are trained and validated on the input data. The ensemble learning methods perform best on our data. In addition, ε-poly(L-lysine) CDs (PL-CDs) were developed to validate the practical applicability of the trained ML models in the laboratory, with two ensemble models handling the prediction.
    Thus, our results demonstrate that ML-based high-throughput theoretical calculation can be used to predict and decode the relationship between CD properties and the antibacterial effect, accelerating the development of high-performance nanoparticles and their potential clinical translation.
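Of the four classifiers named above, k-nearest neighbours (KNN) is the simplest to show in a few lines. The following is a minimal sketch, assuming MIC has been binned into two classes and that each sample is described by two scaled numeric features. The feature meanings, class names, and data points are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank training samples by Euclidean distance to the query point.
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled features, e.g. (particle size, surface charge).
train_points = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.85), (0.9, 0.1), (0.8, 0.2), (0.85, 0.3)]
train_labels = ["low MIC", "low MIC", "low MIC", "high MIC", "high MIC", "high MIC"]

prediction = knn_predict(train_points, train_labels, (0.2, 0.7))
print(prediction)
```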

  • Article type: Journal Article
    In recent times, various machine learning approaches have been widely employed for the effective diagnosis and prediction of diseases such as cancer, thyroid disease, and COVID-19. Alzheimer's disease (AD) is likewise a progressive disease that destroys memory and cognitive function over time. Unfortunately, there are no dedicated AI-based solutions for diagnosing AD that go hand in hand with medical diagnosis, even though multiple factors contribute to the diagnosis, which makes AI a very viable supplementary diagnostic solution. This paper reports an endeavor to apply various machine learning algorithms, namely SGD, k-Nearest Neighbors, Logistic Regression, Decision Tree, Random Forest, AdaBoost, Neural Network, SVM, and Naïve Bayes, to a dataset of affected patients in order to diagnose Alzheimer's disease. Longitudinal collections of subjects from the OASIS dataset have been used for prediction. Moreover, feature selection and dimension reduction methods such as Information Gain, Information Gain Ratio, Gini index, Chi-Squared, and PCA are applied to rank the different factors and identify the optimum number of factors from the dataset for disease diagnosis. Furthermore, the performance of each classifier is evaluated in terms of ROC-AUC, accuracy, F1 score, recall, and precision, and a comparative analysis of the algorithms is included. Our study suggests that approximately 90% classification accuracy is achieved with the four top-ranked features: CDR, SES, nWBV, and EDUC.
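Information Gain, the first ranking criterion listed above, measures how much knowing a feature's value reduces the entropy of the class labels. The sketch below computes it for discrete features. The feature names and the eight toy samples are invented for illustration and do not come from the OASIS dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [l for v, l in zip(feature_values, labels) if v == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Invented toy data: feature_a predicts the label perfectly, feature_b not at all.
labels = ["AD", "AD", "AD", "AD", "healthy", "healthy", "healthy", "healthy"]
features = {
    "feature_a": ["high", "high", "high", "high", "low", "low", "low", "low"],
    "feature_b": ["x", "y", "x", "y", "x", "y", "x", "y"],
}

# Rank features from most to least informative.
ranking = sorted(features, key=lambda name: information_gain(features[name], labels), reverse=True)
print(ranking)
```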

  • Article type: Journal Article
    Complement inhibition has shown promise in various disorders, including COVID-19. A prediction tool that includes complement genetic variants is vital. This study aims to identify crucial complement-related variants and determine an optimal pattern for accurate disease outcome prediction. Genetic data from 204 COVID-19 patients hospitalized between April 2020 and April 2021 at three referral centres were analysed using an artificial intelligence-based algorithm to predict disease outcome (ICU vs. non-ICU admission). A recently introduced alpha-index identified the 30 most predictive genetic variants. The DERGA algorithm, which employs multiple classification algorithms, determined the optimal pattern of these key variants, resulting in 97% accuracy for predicting disease outcome. Individual variation ranged from 40 to 161 variants per patient, with 977 variants detected in total. This study demonstrates the utility of the alpha-index in ranking a substantial number of genetic variants. This approach enables the use of well-established classification algorithms that effectively determine the relevance of genetic variants in predicting outcomes with high accuracy.

  • Article type: Journal Article
    We discuss the implementation challenges of gas sensing systems based on low-frequency noise measurements on chemoresistive sensors. Resistance fluctuations in various gas sensing materials, in a frequency range typically up to a few kHz, can enhance gas sensing when their intensity and the slope of the power spectral density are taken into account. The issues of low-frequency noise measurements in resistive gas sensors, specifically in two-dimensional materials exhibiting gas-sensing properties, are considered. We present measurement setups and noise-processing methods for gas detection. Chemoresistive sensors exhibit a wide range of DC resistances, which require different flicker noise measurement approaches. Separate noise measurement setups are used for resistances up to a few hundred kΩ and for resistances with much higher values. Noise measurements in highly resistive materials (e.g., MoS2, WS2, and ZrS3) are prone to external interference but can be modulated using temperature or light irradiation for enhanced sensing. Therefore, such materials are of considerable interest for gas sensing.
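The slope of the power spectral density mentioned above can be estimated with a straight-line fit on log-log axes: for flicker noise of the form S(f) = A / f^γ, log S is linear in log f with slope −γ. The sketch below fits that slope by ordinary least squares. The synthetic spectrum and the function name are assumptions for illustration, not the processing chain used in the paper.

```python
from math import log10


def psd_slope(frequencies, psd_values):
    """Least-squares slope of log10(PSD) versus log10(frequency)."""
    xs = [log10(f) for f in frequencies]
    ys = [log10(s) for s in psd_values]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


# Synthetic ideal 1/f spectrum, S(f) = 1e-12 / f, sampled from 1 Hz to 1 kHz.
frequencies = [1.0, 10.0, 100.0, 1000.0]
psd_values = [1e-12 / f for f in frequencies]

slope = psd_slope(frequencies, psd_values)
print(round(slope, 3))
```

For measured spectra, the fit would normally be restricted to the band where flicker noise dominates over thermal noise and interference.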

  • Article type: English Abstract
    Necrotizing enterocolitis (NEC), with the main manifestations of bloody stool, abdominal distension, and vomiting, is one of the leading causes of death in neonates, and early identification and diagnosis are crucial for the prognosis of NEC. The emergence and development of machine learning has provided the potential for early, rapid, and accurate identification of this disease. This article summarizes the algorithms of machine learning recently used in NEC, analyzes the high-risk predictive factors revealed by these algorithms, evaluates the ability and characteristics of machine learning in the etiology, definition, and diagnosis of NEC, and discusses the challenges and prospects for the future application of machine learning in NEC.

  • Article type: Journal Article
    Electromyography (EMG) is a form of biological information used in many fields to study human muscle movement, especially in research on bionic hands. EMG signals reflect the activity at a given moment through changes in the signals of human muscles; because these signals are very complex, processing them is very important. The processing of EMG signals can be divided into acquisition, pre-processing, feature extraction, and classification. Not all signal channels are useful in EMG acquisition, and it is important to select the useful signals among them. Therefore, this study proposes a feature extraction method that extracts the two most representative channels from eight-channel signals. In this paper, the traditional principal component analysis method and support vector machine feature elimination are used to extract signal channels. In addition, a new method, the correlation heat map, is proposed. Feature extraction is carried out with all three methods, and three classification algorithms (K-nearest neighbor, random forest, and support vector machine) are used for verification. The results show that the classification accuracy of the proposed method is better than that of the two traditional methods.
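A correlation heat map is a matrix of pairwise correlations between channels. The abstract does not say how channels are chosen from it, so the sketch below makes one plausible assumption: keep the pair of channels with the smallest absolute Pearson correlation, that is, the least redundant pair. The selection rule, the function names, and the four short toy channels (instead of eight) are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def least_correlated_pair(channels):
    """Return the pair of channel indices with the smallest absolute correlation."""
    return min(
        combinations(range(len(channels)), 2),
        key=lambda p: abs(pearson(channels[p[0]], channels[p[1]])),
    )


# Invented toy signals; real EMG channels would be much longer.
channels = [
    [1.0, 2.0, 3.0, 4.0, 5.0],    # channel 0: rising trend
    [1.0, 2.0, 3.0, 4.0, 6.0],    # channel 1: nearly the same as channel 0
    [1.0, -1.0, 0.0, -1.0, 1.0],  # channel 2: uncorrelated with channel 0
    [5.0, 4.0, 3.0, 2.0, 2.0],    # channel 3: roughly mirrors channel 0
]

pair = least_correlated_pair(channels)
print(pair)
```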

  • Article type: Journal Article
    This paper proposes a new machine learning-based method for identifying and predicting financial fraud among listed companies. We collected 18,060 transactions and 363 indicators, comprising 362 financial variables and one class variable. We then eliminated 9 indicators that were not related to financial fraud and processed the missing values. After that, from the remaining 353 indicators we extracted the 13 with the strongest impact on financial fraud, based on multiple feature selection models and the frequency with which each feature occurred across all algorithms. Next, we established five single classification models (LR, RF, XGBoost, SVM, and DT) and three ensemble models built with a voting classifier to predict the financial fraud records of listed companies. Finally, we chose the optimal single model from the five machine learning algorithms and the best ensemble model among all hybrid models. Optimal model parameters were selected using the grid search method and by comparing several evaluation metrics of the models. The accuracy of the optimal single model was found to be in the range of 97% to 99%, and that of the ensemble models was higher than 99%. This shows that the optimal ensemble model performs well and can efficiently predict and detect fraudulent activity by companies. A hybrid model that combines a logistic regression model with an XGBoost model is the best among all models. In the future, it will not only be able to predict fraudulent behavior in company management but also reduce the burden of doing so.
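A voting classifier combines the predictions of several base models. The sketch below shows hard (majority) voting, assuming each base model has already produced a label per sample. The three prediction lists are invented, and the abstract does not state whether the study used hard or soft voting.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Combine per-model label lists by majority vote, sample by sample."""
    combined = []
    for sample_votes in zip(*predictions_per_model):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical outputs of three base classifiers (1 = fraud, 0 = no fraud).
actual = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 0]  # wrong on the third sample
model_b = [1, 1, 1, 1, 0, 0]  # wrong on the second sample
model_c = [1, 0, 1, 0, 0, 0]  # wrong on the fourth sample

ensemble = hard_vote([model_a, model_b, model_c])
print(accuracy(model_a, actual), accuracy(ensemble, actual))
```

Because the three toy models err on different samples, the majority vote corrects each individual mistake, which is the usual motivation for ensembling.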

  • Article type: Journal Article
    The performance of classical security authentication models can be severely affected by imperfect channel estimation as well as time-varying communication links. The commonly used approach of statistical decisions for the physical layer authenticator faces significant challenges in a dynamically changing, non-stationary environment. To address this problem, this paper introduces a deep learning-based authentication approach that learns and tracks the variations of channel characteristics, thereby improving the adaptability and convergence of physical layer authentication. Specifically, an intelligent detection framework based on a Convolutional-Long Short-Term Memory (Convolutional-LSTM) network is designed to deal with channel differences without knowing the statistical properties of the channel. Both the robustness and the detection performance of the learning authentication scheme are analyzed, and extensive simulations and experiments show that the detection accuracy in time-varying environments is significantly improved.

  • Article type: Journal Article
    While RT-PCR is the silver bullet test for confirming COVID-19 infection, it is limited by reagent shortages, long processing times, and the need for specialized labs. As an alternative, most prior studies have focused on applying deep learning algorithms to chest CT and chest X-ray images. However, these two approaches cannot always be used for patient screening because of the radiation doses, high costs, and the low number of available devices. Hence, there is a need for a less expensive and faster diagnostic model to identify positive and negative cases of COVID-19. This study therefore develops six predictive models for COVID-19 diagnosis using six different classifiers (i.e., BayesNet, Logistic, IBk, CR, PART, and J48) based on 14 clinical features. The study retrospectively analyzed 114 cases from the Taizhou hospital of Zhejiang Province in China. The results showed that the CR meta-classifier is the most accurate classifier for predicting positive and negative COVID-19 cases, with an accuracy of 84.21%. The results could help in the early diagnosis of COVID-19, specifically when RT-PCR kits are not sufficient for testing, and could assist countries, particularly developing ones, that suffer from a shortage of RT-PCR tests and specialized laboratories.
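Accuracy and related metrics follow directly from the four cells of a binary confusion matrix. The sketch below computes them. The abstract reports no confusion counts, so the counts used here are invented; they are chosen only so that 96 of 114 predictions are correct, which gives 84.21% if accuracy is computed over all 114 cases (an assumption).

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 score from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts: 28 + 68 = 96 correct predictions out of 114 cases.
metrics = classification_metrics(tp=28, fp=10, fn=8, tn=68)
print(f"accuracy = {metrics['accuracy']:.2%}")
```

Reporting precision and recall alongside accuracy matters for screening tasks, because the same accuracy can hide very different rates of missed positive cases.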