OBJECTIVE: Cardiovascular disease (CD) is a major global health concern, affecting millions with symptoms such as fatigue and chest discomfort. Timely identification is crucial given its significant contribution to global mortality. In healthcare, artificial intelligence (AI) holds promise for advancing disease risk assessment and treatment outcome prediction. However, the evolution of machine learning (ML) raises concerns about data privacy and bias, especially in sensitive healthcare applications. The objective is to develop and implement a responsible AI model for CD prediction that prioritizes patient privacy and security while ensuring transparency, explainability, fairness, and ethical adherence in healthcare applications.
METHODS: To predict CD while prioritizing patient privacy, our study employed data anonymization, which involved adding Laplace noise to sensitive features such as age and gender. The anonymized dataset was then analyzed within a differential privacy (DP) framework, which ensured confidentiality while insights were extracted. The methodology compared Logistic Regression (LR), Gaussian Naïve Bayes (GNB), and Random Forest (RF), and integrated feature selection, statistical analysis, and both SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) for interpretability. This approach facilitates transparent and interpretable AI decision-making, aligning with responsible AI development principles. Overall, it combines privacy preservation, interpretability, and ethical considerations for accurate CD prediction.
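The abstract does not specify how the Laplace noise was calibrated; the sketch below illustrates the standard approach, where the noise scale is set to sensitivity/ε. The `anonymize` helper, the feature names, and the parameter values are illustrative assumptions, not details from the study.

```python
import math
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    # Inverse-CDF sampling of a Laplace(0, scale) distribution.
    u = rng.uniform(-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def anonymize(records, feature, sensitivity, epsilon, seed=0):
    """Perturb one sensitive feature with Laplace noise scaled to sensitivity/epsilon.

    Smaller epsilon => larger noise => stronger privacy, lower utility.
    """
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [{**r, feature: r[feature] + laplace_noise(scale, rng)} for r in records]

# Hypothetical toy records; the real study's features and epsilon are not given.
patients = [{"age": 54, "sex": 1}, {"age": 61, "sex": 0}]
noisy = anonymize(patients, "age", sensitivity=1.0, epsilon=0.5)
```

In a DP pipeline of this shape, the noisy records (rather than the raw ones) would feed the downstream LR, GNB, and RF models, trading a small accuracy loss for a quantifiable privacy guarantee.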
RESULTS: Results from the DP framework with LR were promising, with an area under the curve (AUC) of 0.848 ± 0.03, an accuracy of 0.797 ± 0.02, a precision of 0.789 ± 0.02, a recall of 0.797 ± 0.02, and an F1 score of 0.787 ± 0.02, performance comparable to that of the non-private framework. The SHAP- and LIME-based results support clinical findings, demonstrate a commitment to transparent and interpretable AI decision-making, and align with the principles of responsible AI development.
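The abstract does not show how the SHAP attributions were computed; for a linear model such as LR, exact SHAP values under feature independence reduce to a closed form, phi_i = w_i (x_i - E[x_i]). The weights and inputs below are hypothetical, used only to illustrate that identity.

```python
def linear_shap(weights, x, background_mean):
    # Exact SHAP values for a linear model with independent features:
    #   phi_i = w_i * (x_i - E[x_i])
    # Positive phi_i pushes the prediction above the baseline, negative below.
    return [w * (xi - mu) for w, xi, mu in zip(weights, x, background_mean)]

# Toy example: two features with coefficients 2.0 and -1.0,
# evaluated at x = (3, 4) against a background mean of (1, 2).
attributions = linear_shap([2.0, -1.0], [3.0, 4.0], [1.0, 2.0])
```

Summing the attributions recovers the gap between the model's output at `x` and its output at the background mean, which is the additivity property that makes SHAP explanations directly comparable to the clinical findings cited above.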
CONCLUSIONS: Our study endorses a novel approach to predicting CD that amalgamates data anonymization, privacy-preserving methods, the interpretability tools SHAP and LIME, and ethical considerations. This responsible AI framework ensures accurate predictions, privacy preservation, and user trust, underscoring the importance of comprehensive and transparent ML models in healthcare. This research therefore strengthens the ability to forecast CD, providing a vital lifeline to millions of CD patients globally and potentially preventing numerous fatalities.