Imbalanced ECG signal-based heart disease classification using ensemble machine learning technique

Abstract:

Machine learning (ML)-based classification models are widely used for the automated detection of heart diseases (HDs) from physiological signals such as electrocardiogram (ECG), magnetocardiography (MCG), heart sound (HS), and impedance cardiography (ICG) signals. Of these, ECG-based HD identification is the one most commonly used by clinicians. In the current investigation, the ECG records or subjects have been sampled and are used as inputs to the classification model to distinguish between normal and abnormal patients. The study employs an imbalanced number of ECG samples to train the classification models. A few ML methods that have rarely been used for HD detection, namely support vector machine (SVM), logistic regression (LR), and adaptive boosting (AdaBoost), have been selected. The performance of the developed models is evaluated in terms of accuracy, F1-score, and area under the curve (AUC) using ECG signals of subjects from the publicly available PTB-ECG and MIT-BIH datasets. The models are ranked on these performance metrics, and the AdaBoost and LR classifiers take the first and second positions. These two models are then combined into an ensemble based on the majority voting principle, and the performance of this ensemble model is also determined. In general, the proposed ensemble model demonstrates the best HD detection performance: accuracy, F1-score, and AUC of 0.946, 0.949, and 0.951, respectively, on the PTB-ECG dataset, and 0.921, 0.926, and 0.950 on the MIT-BIH dataset. The proposed methodology can also be employed for HD classification using ICG, MCG, and HS signals as inputs and, further, can be applied to the detection of other diseases.
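The three metrics named in the abstract, and the step of combining two classifiers, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The labels, probabilities, and function names are hypothetical toy values, not data from PTB-ECG or MIT-BIH. The abstract does not say how a two-model majority vote resolves disagreements, so the sketch swaps in soft voting (averaging the two predicted probabilities) as a stated assumption.

```python
from typing import List, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def soft_vote(
    prob_a: Sequence[float], prob_b: Sequence[float], threshold: float = 0.5
) -> Tuple[List[float], List[int]]:
    """Assumed combination rule: average two models' positive-class
    probabilities, then threshold the mean to get a label."""
    mean = [(a + b) / 2 for a, b in zip(prob_a, prob_b)]
    return mean, [int(m >= threshold) for m in mean]


# Hypothetical imbalanced toy labels: 1 = abnormal, 0 = normal.
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
# Hypothetical abnormal-class probabilities from two already-trained models.
prob_model_a = [0.9, 0.8, 0.7, 0.4, 0.9, 0.6, 0.3, 0.7]  # stand-in for AdaBoost
prob_model_b = [0.8, 0.9, 0.6, 0.7, 0.7, 0.3, 0.2, 0.5]  # stand-in for LR

ens_prob, ens_pred = soft_vote(prob_model_a, prob_model_b)
print("accuracy:", accuracy(y_true, ens_pred))
print("F1-score:", f1_score(y_true, ens_pred))
print("AUC:", auc(y_true, ens_prob))
```

Reporting F1-score and AUC alongside accuracy matters with imbalanced classes, because a model that always predicts the larger class can score a high accuracy while missing the smaller class entirely.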
