Binary classification with fuzzy logistic regression under class imbalance and complete separation in clinical studies.

Abstract:

BACKGROUND: In binary classification for clinical studies, an imbalanced distribution of cases across classes and an extreme level of association between the binary dependent variable and a subset of independent variables can create significant classification problems. These issues, namely class imbalance and complete separation, lead to inaccurate classification and biased results in clinical studies.
METHODS: To deal with class imbalance and complete separation, we propose using a fuzzy logistic regression framework for binary classification. Fuzzy logistic regression incorporates combinations of triangular fuzzy numbers for the coefficients, inputs, and outputs and produces crisp classification results. The framework shows strong classification performance due to fuzzy logic's better handling of imbalance and separation issues. Hence, classification accuracy is improved, mitigating the risk of misclassified conditions and biased insights for clinical study patients.
RESULTS: The performance of the fuzzy logistic regression model is assessed on twelve binary classification problems with clinical datasets. The model has consistently high sensitivity, specificity, F1, precision, and Matthews correlation coefficient scores across all clinical datasets. There is no evidence of impact from the imbalance or separation that exists in the datasets. Furthermore, we compare the classification performance of fuzzy logistic regression against two versions of classical logistic regression and six different benchmark sources in the literature. These six sources provide a total of ten different proposed methodologies, and the comparison is made by calculating the same set of classification performance scores for each method. Either imbalance or separation impacts seven out of ten methodologies. The remaining three produce better classification performance in their respective clinical studies. However, these are all outperformed by the fuzzy logistic regression framework.
CONCLUSIONS: Fuzzy logistic regression showcases strong performance against imbalance and separation, providing accurate predictions and, hence, informative insights for classifying patients in clinical studies.
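
To illustrate why complete separation is a problem for classical logistic regression, the following is a minimal sketch and not the authors' method. It fits a one-feature logistic model without an intercept by plain gradient ascent on the log-likelihood. The function name `fit_logistic_1d`, the learning rate, and both toy datasets are assumptions made up for this illustration. When the feature separates the classes perfectly, the coefficient keeps growing as more iterations are run, because no finite maximum likelihood estimate exists; with overlapping classes it settles at a finite value.

```python
import math


def fit_logistic_1d(xs, ys, steps, lr=0.5):
    """Gradient ascent on the log-likelihood of p(y=1|x) = sigmoid(w * x).

    Hypothetical helper for illustration only (one feature, no intercept).
    """
    w = 0.0
    for _ in range(steps):
        # Gradient of the log-likelihood: sum over cases of (y - p) * x
        grad = sum((y - 1.0 / (1.0 + math.exp(-w * x))) * x
                   for x, y in zip(xs, ys))
        w += lr * grad
    return w


xs = [-2.0, -1.0, 1.0, 2.0]

# Made-up data with complete separation: x < 0 always gives y = 0,
# x > 0 always gives y = 1.
ys_separated = [0, 0, 1, 1]
w_sep_short = fit_logistic_1d(xs, ys_separated, steps=100)
w_sep_long = fit_logistic_1d(xs, ys_separated, steps=10000)

# Made-up data with overlapping classes: a finite estimate exists.
ys_overlap = [0, 1, 0, 1]
w_ovl_short = fit_logistic_1d(xs, ys_overlap, steps=100)
w_ovl_long = fit_logistic_1d(xs, ys_overlap, steps=10000)

print("separated  :", w_sep_short, "->", w_sep_long)   # keeps increasing
print("overlapping:", w_ovl_short, "->", w_ovl_long)   # stabilizes
```

In the separated case the estimate depends on when the optimizer is stopped, which is one way the resulting coefficients and predicted probabilities can become unreliable.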

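The performance scores named in the RESULTS paragraph can all be derived from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_scores`, the convention of returning 0.0 when a denominator is zero, and the example counts are assumptions for illustration and are not taken from the study. The second example shows a classifier that predicts the majority class for every patient in an imbalanced sample: its accuracy looks high, while sensitivity and the Matthews correlation coefficient are both zero.

```python
import math


def _safe_div(num, den):
    # Assumed convention for this sketch: an undefined ratio is reported as 0.0.
    return num / den if den else 0.0


def classification_scores(tp, fp, tn, fn):
    """Standard scores computed from confusion-matrix counts."""
    sensitivity = _safe_div(tp, tp + fn)   # recall on the positive class
    specificity = _safe_div(tn, tn + fp)   # recall on the negative class
    precision = _safe_div(tp, tp + fp)
    f1 = _safe_div(2 * precision * sensitivity, precision + sensitivity)
    mcc = _safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    accuracy = _safe_div(tp + tn, tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
        "accuracy": accuracy,
    }


# Made-up counts: 13 positive and 87 negative cases.
balanced_effort = classification_scores(tp=8, fp=2, tn=85, fn=5)

# Made-up counts: 5 positive and 95 negative cases, every case predicted negative.
majority_only = classification_scores(tp=0, fp=0, tn=95, fn=5)

for name, scores in [("balanced_effort", balanced_effort),
                     ("majority_only", majority_only)]:
    print(name, {k: round(v, 3) for k, v in scores.items()})
```
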