Vertebral Column Pathology Diagnosis Using Ensemble Strategies Based on Supervised Machine Learning Techniques [J]

Abstract:

One expanding area of bioinformatics is medical diagnosis through the categorization of biomedical characteristics. Automated strategies that improve diagnosis through machine learning (ML) methods are challenging to build. They require a formal examination of their performance to identify the conditions under which the ML method works best. This work proposes variants of the Voting and Stacking ensemble strategies (VC and SC, respectively), built on diverse auto-tuned supervised ML techniques, to increase the efficacy of traditional baseline classifiers for the automatic diagnosis of orthopedic illnesses of the vertebral column. The ensemble strategies are created by first combining a complete set of auto-tuned baseline classifiers based on different processes: geometric, probabilistic, logic, and optimization. Next, the three most promising classifiers are selected from among k-Nearest Neighbors (kNN), Naïve Bayes (NB), Logistic Regression (LR), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Support Vector Machine (SVM), Artificial Neural Networks (ANN), and Decision Tree (DT). Grid-search K-fold cross-validation is applied to auto-tune the hyperparameters of the baseline classifiers. The performance of each proposed ensemble strategy is compared independently with that of the auto-tuned baseline classifiers. A concise analysis evaluates the accuracy, precision, recall, F1-score, and ROC-AUC metrics. The analysis also examines the misclassified disease cases to find the most and least reliable classifiers for this specific medical problem. The results show that the VC ensemble strategy provides an improvement comparable to that of the best baseline classifier (kNN). Meanwhile, when all baseline classifiers are included in the SC ensemble, this strategy surpasses 95% in all the evaluated metrics, standing out as the most suitable option for classifying vertebral column diseases.
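The pipeline described in the abstract (grid-search K-fold tuning of baseline classifiers, followed by Voting and Stacking ensembles) can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not name a library, so the use of scikit-learn is an assumption, as are the hyperparameter grids, the choice of three baselines (kNN, NB, LR), the logistic-regression meta-learner, and the synthetic stand-in data used in place of the vertebral column dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): NOT the dataset used in the paper.
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_redundant=0, n_classes=3,
                           n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# K-fold splitter shared by the grid search and the stacking ensemble.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Baseline classifiers with small, hypothetical hyperparameter grids.
candidates = {
    "knn": (make_pipeline(StandardScaler(), KNeighborsClassifier()),
            {"kneighborsclassifier__n_neighbors": [3, 5, 7, 9]}),
    "nb": (GaussianNB(), {"var_smoothing": [1e-9, 1e-8, 1e-7]}),
    "lr": (make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
           {"logisticregression__C": [0.1, 1.0, 10.0]}),
}

# Auto-tune each baseline with grid-search K-fold cross-validation.
tuned = []
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=cv, scoring="accuracy")
    search.fit(X_train, y_train)
    tuned.append((name, search.best_estimator_))

# Voting (VC) averages predicted probabilities; Stacking (SC) trains a
# meta-learner on the cross-validated predictions of the baselines.
voting = VotingClassifier(estimators=tuned, voting="soft")
stacking = StackingClassifier(estimators=tuned,
                              final_estimator=LogisticRegression(max_iter=1000),
                              cv=cv)

scores = {}
for label, ensemble in [("voting", voting), ("stacking", stacking)]:
    ensemble.fit(X_train, y_train)
    pred = ensemble.predict(X_test)
    scores[label] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1_macro": f1_score(y_test, pred, average="macro"),
    }
    print(label, scores[label])
```

The scores printed here describe only the synthetic data and say nothing about the results reported in the abstract. Precision, recall, and ROC-AUC could be added in the same loop with the corresponding `sklearn.metrics` functions.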
