interpretable artificial intelligence

  • Article type: Journal Article
    OBJECTIVE: Muscle radiodensity loss after surgery and adjuvant chemotherapy is associated with poor outcomes in ovarian cancer. Assessing muscle radiodensity is a real-world clinical challenge owing to the requirement for computed tomography (CT) with consistent protocols and labor-intensive processes. This study aimed to use interpretable machine learning (ML) to predict muscle radiodensity loss.
    METHODS: This study included 723 patients with ovarian cancer who underwent primary debulking surgery and platinum-based chemotherapy between 2010 and 2019 at two tertiary centers (579 in cohort 1 and 144 in cohort 2). Muscle radiodensity was assessed from pre- and post-treatment CT acquired with consistent protocols, and a decrease in radiodensity ≥ 5% was defined as loss. Six ML models were trained, and their performances were evaluated using the area under the curve (AUC) and F1-score. The SHapley Additive exPlanations (SHAP) method was applied to interpret the ML models.
    RESULTS: Among the models evaluated on the training set, CatBoost achieved the highest AUC, 0.871 (95% confidence interval, 0.870-0.874), and the highest F1-score, 0.688 (95% confidence interval, 0.685-0.691). It also outperformed the other models on the external validation set, with an AUC of 0.839 and an F1-score of 0.673. Albumin change, ascites, and residual disease were the most important features associated with a higher likelihood of muscle radiodensity loss. The SHAP force plot provided an individualized interpretation of model predictions.
    CONCLUSIONS: An interpretable ML model can help clinicians identify ovarian cancer patients at risk of muscle radiodensity loss after treatment and understand the contributors to that loss.
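    The per-patient attributions that SHAP reports rest on Shapley values. The sketch below computes them exactly for a toy scoring function, which is feasible only for a handful of features. It is not the study's CatBoost model or the SHAP library: `toy_model`, its coefficients, the feature ordering, and the `patient` and `reference` vectors are all hypothetical, chosen only to echo the three features named in the abstract.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one prediction of `model`.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2**n, so this only suits a few features.
    """
    n = len(instance)

    def value(coalition):
        # Coalition features come from the instance, the rest from the baseline.
        x = [instance[j] if j in coalition else baseline[j] for j in range(n)]
        return model(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(x):
    # Hypothetical risk score over (albumin change, ascites, residual disease).
    albumin_change, ascites, residual = x
    return (0.2 - 0.3 * albumin_change + 0.25 * ascites
            + 0.15 * residual + 0.1 * ascites * residual)


patient = [-0.5, 1.0, 1.0]    # hypothetical patient
reference = [0.0, 0.0, 0.0]   # hypothetical baseline
phi = shapley_values(toy_model, patient, reference)
```

    The attributions add up to the gap between the patient's score and the baseline score, which is the property a force plot visualizes; the interaction term between ascites and residual disease is split evenly between those two features.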

  • Article type: Journal Article
    BACKGROUND: Machine learning techniques have been shown to be efficient in identifying health misinformation, but the results may not be trusted unless they can be justified in a way that is understandable.
    OBJECTIVE: This study aimed to provide a new criteria-based system to assess and justify health news quality. Using a subset of an existing set of criteria, this study compared the feasibility of 2 alternative methods for adding interpretability. Both methods used classification and highlighting to visualize sentence-level evidence.
    METHODS: A total of 3 out of 10 well-established criteria were chosen for experimentation, namely whether the health news discussed the costs of the intervention (the cost criterion), explained or quantified the harms of the intervention (the harm criterion), and identified the conflicts of interest (the conflict criterion). The first step of the experiment was to automate the evaluation of the 3 criteria by developing a sentence-level classifier. We tested Logistic Regression, Naive Bayes, Support Vector Machine, and Random Forest algorithms. Next, we compared the 2 visualization approaches. For the first approach, we calculated word feature weights, which explained how classification models distill keywords that contribute to the prediction; then, using the local interpretable model-agnostic explanation framework, we selected keywords associated with the classified criterion at the document level; and finally, the system selected and highlighted sentences with keywords. For the second approach, we extracted sentences that provided evidence to support the evaluation result from 100 health news articles; based on these results, we trained a typology classification model at the sentence level; and then, the system highlighted a positive sentence instance for the result justification. The number of sentences to highlight was determined by a preset threshold empirically determined using the average accuracy.
    RESULTS: The automatic evaluation of health news on the cost, harm, and conflict criteria achieved average area under the curve scores of 0.88, 0.76, and 0.73, respectively, after 50 repetitions of 10-fold cross-validation. We found that both approaches could successfully visualize the interpretation of the system, but that their performance varied by criterion and that highlighting accuracy decreased as the number of highlighted sentences increased. With a threshold accuracy of ≥75%, the resulting visualization varied in length from 1 to 6 sentences.
    CONCLUSIONS: We provided 2 approaches to interpret criteria-based health news evaluation models tested on 3 criteria. This method incorporated rule-based and statistical machine learning approaches. The results suggested that one might visually interpret an automatic criterion-based health news quality evaluation successfully using either approach; however, larger differences may arise when multiple quality-related criteria are considered. This study can increase public trust in computerized health information evaluation.
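    The final step of the first approach, selecting and highlighting sentences that contain criterion keywords, can be sketched as follows. This is a minimal illustration, not the authors' system: the keyword weights are hypothetical stand-ins for what a LIME-style explanation might return, and `highlight_sentences` and its parameters are names introduced here. The default cap of 6 sentences mirrors the upper end of the range reported in the results.

```python
import re


def highlight_sentences(document, keyword_weights, max_sentences=6):
    """Return up to `max_sentences` sentences ranked by summed keyword weight."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    scored = []
    for position, sentence in enumerate(sentences):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        score = sum(keyword_weights.get(tok, 0.0) for tok in tokens)
        if score > 0:
            scored.append((score, position, sentence))
    # Highest score first; earlier sentences win ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sentence for _, _, sentence in scored[:max_sentences]]


# Hypothetical keyword weights for the cost criterion.
cost_keywords = {"cost": 1.0, "costs": 1.0, "price": 0.8, "insurance": 0.5}
article = (
    "A new drug lowered blood pressure in a small trial. "
    "The treatment costs about 300 dollars a month. "
    "Most insurance plans do not cover it yet."
)
highlighted = highlight_sentences(article, cost_keywords, max_sentences=2)
```

    Ranking by summed weight means that raising `max_sentences` admits progressively weaker matches, which is consistent with the reported drop in highlighting accuracy as more sentences are shown.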

  • Article type: Journal Article
    The growing integration of Internet of Things (IoT) technology within the healthcare sector has revolutionized healthcare delivery, enabling advanced personalized care and precise treatments. However, this raises significant challenges, demanding robust, intelligible, and effective monitoring mechanisms. We propose an interpretable machine-learning approach to the trustworthy and effective detection of behavioral anomalies within the realm of medical IoT. The discovered anomalies serve as indicators of potential system failures and security threats. Essentially, the detection of anomalies is accomplished by learning a classifier from the operational data generated by smart devices. The learning problem is dealt with in predictive association modeling, whose expressiveness and intelligibility enforce trustworthiness to offer a comprehensive, fully interpretable, and effective monitoring solution for the medical IoT ecosystem. Preliminary results show the effectiveness of our approach.

  • Article type: Journal Article
    Imaging flow cytometry (IFC) allows rapid acquisition of numerous single-cell images per second, capturing information from multiple fluorescent channels. However, the traditional process of staining cells with fluorescently labeled conjugated antibodies for IFC analysis is time-consuming, expensive, and potentially harmful to cell viability. To streamline experimental workflows and reduce costs, it is crucial to identify the most relevant channels for downstream analysis. In this study, we introduce PXPermute, a user-friendly and powerful method for assessing the significance of IFC channels, particularly for cell profiling. Our approach evaluates channel importance by permuting pixel values within each channel and analyzing the resulting impact on machine learning or deep learning models. Through rigorous evaluation of three multichannel IFC image datasets, we demonstrate PXPermute's potential in accurately identifying the most informative channels, aligning with established biological knowledge. PXPermute can assist biologists with systematic channel analysis, experimental design optimization, and biomarker identification.
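    The idea of scoring a channel by permuting its pixel values and measuring the effect on a model can be sketched as below. This is not the PXPermute implementation: images are simplified to flat pixel lists per channel, importance is taken as the drop in accuracy, and `toy_model`, the synthetic data, and all function names are hypothetical.

```python
import random


def accuracy(model, images, labels):
    hits = sum(model(img) == y for img, y in zip(images, labels))
    return hits / len(labels)


def channel_importance(model, images, labels, n_channels, seed=0):
    """Accuracy drop after shuffling pixel values inside each channel."""
    rng = random.Random(seed)
    baseline = accuracy(model, images, labels)
    drops = []
    for c in range(n_channels):
        permuted = []
        for img in images:
            copy = [list(channel) for channel in img]
            rng.shuffle(copy[c])  # destroys spatial layout, keeps intensities
            permuted.append(copy)
        drops.append(baseline - accuracy(model, permuted, labels))
    return drops


# Toy data: channel 0 encodes the class in its layout, channel 1 is noise.
data_rng = random.Random(42)
images, labels = [], []
for k in range(200):
    y = k % 2
    signal = [1.0] * 4 + [0.0] * 4 if y == 1 else [0.0] * 4 + [1.0] * 4
    noise = [data_rng.random() for _ in range(8)]
    images.append([signal, noise])
    labels.append(y)


def toy_model(img):
    # Hypothetical classifier: compares the two halves of channel 0 only.
    return int(sum(img[0][:4]) > sum(img[0][4:]))


importance = channel_importance(toy_model, images, labels, n_channels=2)
```

    Here the noise channel receives an importance of zero because the toy classifier never reads it, while shuffling channel 0 pushes accuracy toward chance. With a trained model, a channel with a small drop would be a candidate for omission from staining.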

  • Article type: Journal Article
    Current stress detection methods concentrate on distinguishing stress from non-stress states, despite the existence of various stress types. The present study performs a more specific, explainable stress classification, which could provide valuable information on physiological stress reactions.
    Physiological responses were measured in the Maastricht Acute Stress Test (MAST), which comprises alternating trials of cold pressor (inducing physiological stress and pain) and mental arithmetic (eliciting cognitive and social-evaluative stress). The responses in these subtasks were compared with each other and with the baseline through mixed model analysis. Subsequently, stress type detection was conducted with a comprehensive analysis of several machine learning components affecting classification. Finally, explainable artificial intelligence (XAI) methods were applied to analyze the influence of physiological features on model behavior.
    Most of the investigated physiological reactions were specific to the stressors, and the subtasks could be distinguished from baseline with up to 86.5% balanced accuracy. The choice of physiological signals to measure (up to a 25 percentage-point difference in balanced accuracy) and the selection of features (up to a 7 percentage-point difference) were the two key components in classification. Comparing the XAI analysis with the mixed model results and human physiology revealed that the stress detection model concentrated on physiological features relevant to the two stressors.
    The findings confirm that multimodal machine learning classification can distinguish different types of stress reactions from baseline while focusing on physiologically sensible changes. Because the measured signals and feature selection affected classification performance the most, data-analytic choices could not compensate for limited input information.
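    Balanced accuracy, the metric quoted above, is the mean of the per-class recalls, so a large baseline class cannot dominate the score. The sketch below shows the computation on hypothetical labels; the class names and counts are illustrative and are not taken from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; each class counts equally regardless of size."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical labels: a large baseline class and two small stressor classes.
y_true = ["baseline"] * 6 + ["cold_pressor"] * 2 + ["arithmetic"] * 2
y_pred = ["baseline"] * 6 + ["cold_pressor", "baseline"] + ["arithmetic"] * 2
score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

    In this example plain accuracy is 9/10, while balanced accuracy is lower because the single missed cold-pressor trial halves the recall of that small class.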

  • Article type: Journal Article
    Researchers increasingly turn to explainable artificial intelligence (XAI) to analyze omics data and gain insights into the underlying biological processes. Yet, given the interdisciplinary nature of the field, many findings have only been shared in their respective research community. An overview of XAI for omics data is needed to highlight promising approaches and help detect common issues. Toward this end, we conducted a systematic mapping study. To identify relevant literature, we queried Scopus, PubMed, Web of Science, BioRxiv, MedRxiv and arXiv. Based on keywording, we developed a coding scheme with 10 facets regarding the studies' AI methods, explainability methods and omics data. Our mapping study resulted in 405 included papers published between 2010 and 2023. The inspected papers analyze DNA-based (mostly genomic), transcriptomic, proteomic or metabolomic data by means of neural networks, tree-based methods, statistical methods and further AI methods. The preferred post-hoc explainability methods are feature relevance (n = 166) and visual explanation (n = 52), while papers using interpretable approaches often resort to the use of transparent models (n = 83) or architecture modifications (n = 72). With many research gaps still apparent for XAI for omics data, we deduced eight research directions and discuss their potential for the field. We also provide exemplary research questions for each direction. Many problems with the adoption of XAI for omics data in clinical practice are yet to be resolved. This systematic mapping study outlines extant research on the topic and provides research directions for researchers and practitioners.

  • Article type: Journal Article
    Cardiovascular disease (CVD) and cancer are the first and second leading causes of death in over 130 countries. They are also among the top three causes in almost 180 countries worldwide. Cardiovascular complications are often observed in cancer patients, with nearly 20% exhibiting cardiovascular comorbidities. Physical exercise may be helpful for cancer survivors and people living with cancer (PLWC), as it prevents relapses, CVD, and cardiotoxicity. Therefore, it is beneficial to recommend exercise as part of cardio-oncology preventive care.
    With the progress of deep learning algorithms and the improvement of big data processing techniques, artificial intelligence (AI) has gradually become popular in medicine and healthcare. Given the shortage of medical resources in China, adopting AI and machine learning methods for prescription recommendations is of great significance. This study aims to develop an interpretable machine learning-based intelligent system of exercise prescription for cardio-oncology preventive care, and this paper presents the study protocol.
    This will be a retrospective machine learning modeling cohort study with interventional methods (i.e., exercise prescription). We will recruit PLWC participants at baseline (from 1 January 2025 to 31 December 2026) and follow up over several years (from 1 January 2027 to 31 December 2028). Specifically, participants will be eligible if they are (1) PLWC in Stage I or cancer survivors from Stage I; (2) aged between 18 and 55 years; (3) interested in physical exercise for rehabilitation; (4) willing to wear smart sensors/watches; and (5) assessed by doctors as suitable for exercise interventions. At baseline, clinical exercise physiologists certified by the joint training program (from 1 January 2023 to 31 December 2024) of the American College of Sports Medicine and the Chinese Association of Sports Medicine will recommend an exercise prescription to each participant. During follow-up, effective exercise prescriptions will be determined by assessing the CVD status of the participants.
    This study aims to develop not only an interpretable machine learning model to recommend exercise prescriptions but also an intelligent system of exercise prescription for precision cardio-oncology preventive care.
    This study is approved by the Human Experimental Ethics Inspection of Guangzhou Sport University.
    http://www.chictr.org.cn, identifier ChiCTR2300077887.

  • Article type: Journal Article
    Injury, hospitalization, and even death are common consequences of falling for elderly people. Therefore, early and robust identification of people at risk of recurrent falling is crucial from a preventive point of view. This study aims to evaluate the effectiveness of an interpretable semi-supervised approach in identifying individuals at risk of falls by using the data provided by ankle-mounted IMU sensors. Our method benefits from the cause-effect link between a fall event and balance ability to pinpoint the moments with the highest fall probability. This framework also has the advantage of training on unlabeled data, and one can exploit its interpretation capacities to detect the target while only using patient metadata, especially those in relation to balance characteristics. This study shows that a visual-based self-attention model is able to infer the relationship between a fall event and loss of balance by attributing high values of weight to moments where the vertical acceleration component of the IMU sensors exceeds 5 m/s² during an especially short period. This semi-supervised approach uses interpretable features to highlight the moments of the recording that may explain the score of balance, thus revealing the moments with the highest risk of falling. Our model allows for the detection of 71% of the possible falling risk events in a window of 1 s (500 ms before and after the target) when compared with threshold-based approaches. This type of framework plays a paramount role in reducing the costs of annotation in the case of fall prevention when using wearable devices. Overall, this adaptive tool can provide valuable data to healthcare professionals, and it can assist them in enhancing fall prevention efforts on a larger scale with lower costs.
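    The two quantitative ingredients in this abstract, a 5 m/s² threshold on vertical acceleration and a 1 s matching window (500 ms before and after the target), can be sketched as a simple threshold-based baseline with window-based scoring. This is not the authors' self-attention model: the sampling rate, signal values, target times, and function names below are hypothetical.

```python
def threshold_events(times_s, vertical_acc, threshold=5.0):
    """Timestamps where the vertical acceleration exceeds `threshold` (m/s²)."""
    return [t for t, a in zip(times_s, vertical_acc) if a > threshold]


def detection_rate(targets_s, detections_s, half_window_s=0.5):
    """Share of targets with at least one detection within ±half_window_s."""
    if not targets_s:
        return 0.0
    hits = sum(
        any(abs(d - t) <= half_window_s for d in detections_s)
        for t in targets_s
    )
    return hits / len(targets_s)


# Hypothetical 10 Hz recording with two acceleration spikes.
times = [i / 10 for i in range(50)]
acc = [0.5] * 50
acc[12] = 6.2   # spike at 1.2 s
acc[30] = 5.4   # spike at 3.0 s

targets = [1.0, 3.4, 4.5]   # hypothetical annotated risk moments, in seconds
detections = threshold_events(times, acc)
rate = detection_rate(targets, detections)
```

    A target counts as detected if any flagged moment falls inside its 1 s window, so the first two targets are matched here and the third is missed.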

  • Article type: Journal Article
    BACKGROUND: Hepatitis C is a prevalent disease that poses a high risk to the human liver. Early diagnosis of hepatitis C is crucial for treatment and prognosis. Therefore, developing an effective medical decision system is essential. In recent years, many computational methods have been proposed to identify hepatitis C patients. Although existing hepatitis prediction models have achieved good results in terms of accuracy, most of them are black-box models and cannot gain the trust of doctors and patients in clinical practice. As a result, this study aims to use various Machine Learning (ML) models to predict whether a patient has hepatitis C, while also using explainable models to elucidate the prediction process of the ML models, thus making the prediction process more transparent.
    RESULTS: We conducted a study on the prediction of hepatitis C based on serological testing and provided comprehensive explanations for the prediction process. Throughout the experiment, we modeled the benchmark dataset and evaluated model performance using fivefold cross-validation and independent testing experiments. After evaluating three types of black-box machine learning models, Random Forest (RF), Support Vector Machine (SVM), and AdaBoost, we adopted Bayesian-optimized RF as the classification algorithm. For model interpretation, in addition to using the common SHapley Additive exPlanations (SHAP) to provide global explanations for the model, we also used Local Interpretable Model-Agnostic Explanations with stability (LIME_stability) to provide local explanations.
    CONCLUSIONS: Both the fivefold cross-validation and independent testing show that our proposed method significantly outperforms the state-of-the-art method. IHCP maintains excellent model interpretability while obtaining excellent predictive performance. This helps uncover potential predictive patterns of the model and enables clinicians to better understand the model's decision-making process.

  • Article type: Journal Article
    BACKGROUND: COVID-19 patients in the convalescent stage frequently have pulmonary diffusing capacity impairment (PDCI). Pulmonary diffusing capacity is a frequently used indicator of the pulmonary function prognosis of COVID-19 survivors, but studies focusing on predicting it in this population are limited. The aim of this study was to develop and validate a machine learning (ML) model for predicting PDCI in COVID-19 patients using routinely available clinical data, thus assisting clinical diagnosis.
    METHODS: Data were collected in a follow-up study, conducted from August to September 2021, of 221 hospitalized COVID-19 survivors 18 months after discharge from hospitals in Wuhan, and included demographic characteristics and clinical examinations. The data were randomly split into a training set (80%) and a validation set (20%). Six popular machine learning models were developed to predict the pulmonary diffusing capacity of patients infected with COVID-19 in the recovery stage. The performance indicators included area under the curve (AUC), accuracy, recall, precision, positive predictive value (PPV), negative predictive value (NPV), and F1. The best-performing model was defined as the optimal model and was further used in the interpretability analysis. The MAHAKIL method was used to balance the sample distribution, and the RFECV method was used to select the feature combination most favorable to machine learning.
    RESULTS: A total of 221 COVID-19 survivors were recruited in this study after discharge from hospitals in Wuhan. Of these participants, 117 (52.94%) were female, with a median age of 58.2 years (standard deviation (SD) = 12). After feature selection, 31 of the 37 clinical factors were retained for constructing the model. Among the six tested ML models, XGBoost performed best, with an AUC of 0.755 and an accuracy of 78.01% after experimental verification. The SHapley Additive exPlanations (SHAP) summary analysis showed that hemoglobin (Hb), maximal voluntary ventilation (MVV), severity of illness, platelet (PLT), uric acid (UA), and blood urea nitrogen (BUN) were the six most important factors affecting the XGBoost model's decisions.
    CONCLUSIONS: The XGBoost model reported here showed good prognostic prediction ability for PDCI in COVID-19 survivors during the recovery period. In the interpretation based on SHAP value importance, Hb and MVV contributed the most to the prediction of PDCI outcomes in these survivors.
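    The performance indicators listed in the methods can all be derived from the four confusion-matrix counts. The sketch below shows those formulas; the counts are hypothetical (roughly the size of a 20% validation split of 221 participants) and are not the study's results. Note that precision and PPV are the same quantity under two names.

```python
def binary_metrics(tp, fp, tn, fn):
    """Common metrics from confusion-matrix counts (no zero-division guard)."""
    precision = tp / (tp + fp)   # identical to positive predictive value
    recall = tp / (tp + fn)      # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": recall,
        "precision": precision,
        "ppv": precision,
        "npv": tn / (tn + fn),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for a validation set of 44 participants.
metrics = binary_metrics(tp=12, fp=4, tn=22, fn=6)
```

    Reporting recall and NPV alongside accuracy matters for an imbalanced outcome such as PDCI, because accuracy alone can look acceptable while many impaired patients are missed.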