Explainability

  • Article type: Journal Article
    Prenatal alcohol exposure (PAE) refers to the exposure of the developing fetus to alcohol consumed during pregnancy and can have life-long consequences for learning, behavior, and health. Understanding the impact of PAE on the developing brain presents challenges due to the brain's complex structural and functional attributes, which can be addressed by leveraging machine learning (ML) and deep learning (DL) approaches. While most ML and DL models have been tailored for adult-centric problems, this work focuses on applying DL to detect PAE in the pediatric population. This study integrates the pre-trained simple fully convolutional network (SFCN) as a transfer learning approach for extracting features with a newly trained classifier to distinguish between unexposed and PAE participants based on T1-weighted structural brain magnetic resonance (MR) scans of individuals aged 2-8 years. Across several dataset sizes and augmentation strategies explored during training, the classifier achieved the highest sensitivity of 88.47%, with an average accuracy of 85.04% on the test data, when trained on a balanced dataset with augmentation for both classes. Moreover, we performed a preliminary explainability analysis using the Grad-CAM method, which highlighted brain regions such as the corpus callosum, cerebellum, pons, and white matter as the most important features in the model's decision-making process. Despite the challenges of constructing DL models for pediatric populations due to the brain's rapid development, motion artifacts, and insufficient data, this work highlights the potential of transfer learning in situations where data are limited. Furthermore, this study underscores the importance of preserving a balanced dataset for fair classification and of clarifying the rationale behind the model's predictions using explainability analysis.
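
    As a rough illustration of the pipeline this abstract describes (a frozen pre-trained 3D feature extractor, a newly trained classification head, and Grad-CAM saliency over the last convolutional block), the following PyTorch sketch uses a toy backbone with random weights; the layer layout and names are assumptions rather than the authors' SFCN code.

```python
# Illustrative sketch only: a frozen pre-trained 3D CNN backbone (stand-in for SFCN),
# a newly trained classifier head, and Grad-CAM over the last convolutional layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBackbone(nn.Module):
    """Toy stand-in for a pre-trained SFCN-style feature extractor."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
        )
    def forward(self, x):
        return self.features(x)

backbone = TinyBackbone()
for p in backbone.parameters():          # transfer learning: freeze the feature extractor
    p.requires_grad = False
head = nn.Sequential(nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, 2))  # new classifier

# Hooks that capture activations and gradients of the last conv layer for Grad-CAM.
activations, gradients = {}, {}
def save_act(module, inp, out): activations["a"] = out
def save_grad(module, grad_in, grad_out): gradients["g"] = grad_out[0]
last_conv = backbone.features[3]
last_conv.register_forward_hook(save_act)
last_conv.register_full_backward_hook(save_grad)

scan = torch.randn(1, 1, 64, 64, 64, requires_grad=True)   # placeholder T1-weighted volume
logits = head(backbone(scan))
logits[0, 1].backward()                  # gradient of the "PAE" class score

weights = gradients["g"].mean(dim=(2, 3, 4), keepdim=True)  # channel weights from pooled gradients
cam = F.relu((weights * activations["a"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=scan.shape[2:], mode="trilinear", align_corners=False)
print(cam.shape)                         # saliency volume aligned with the input scan
```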

  • Article type: Journal Article
    Users of social platforms often perceive these sites as supportive spaces to post about their mental health issues. Those conversations contain important traces about individuals' health risks. Recently, researchers have exploited this online information to construct mental health detection models, which aim to identify users at risk on platforms like Twitter, Reddit or Facebook. Most of these models are focused on achieving good classification results, ignoring the explainability and interpretability of the decisions. Recent research has pointed out the importance of using clinical markers, such as the use of symptoms, to improve health professionals' trust in computational models. In this paper, we introduce transformer-based architectures designed to detect and explain the appearance of depressive symptom markers in user-generated content from social media. We present two approaches: (i) training one model to classify and a separate model to explain the classifier's decisions, and (ii) unifying the two tasks within a single model. Additionally, for the latter approach, we also investigate the performance of recent conversational Large Language Models (LLMs), using both in-context learning and fine-tuning. Our models provide natural language explanations aligned with validated symptoms, thus enabling clinicians to interpret the decisions more effectively. We evaluate our approaches using recent symptom-focused datasets, with both offline metrics and expert-in-the-loop evaluations to assess the quality of our models' explanations. Our findings demonstrate that it is possible to achieve good classification results while generating interpretable symptom-based explanations.
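
    A minimal sketch of the two-model setup in approach (i) can be put together with off-the-shelf Hugging Face pipelines; the checkpoints (a generic sentiment classifier standing in for a depression-risk classifier, and flan-t5-small as the explainer) and the prompt are illustrative assumptions, not the models trained in the paper.

```python
# Illustrative sketch of approach (i): one model classifies, a second model explains.
# The checkpoints and prompt are placeholders, not the authors' trained models.
from transformers import pipeline

classifier = pipeline("text-classification",
                      model="distilbert-base-uncased-finetuned-sst-2-english")
explainer = pipeline("text2text-generation", model="google/flan-t5-small")

post = "I haven't slept properly in weeks and nothing feels worth doing anymore."

label = classifier(post)[0]                         # step 1: classify the post
prompt = ("List any depressive symptoms (e.g. insomnia, anhedonia) mentioned in this post, "
          "or answer 'none': " + post)
explanation = explainer(prompt, max_new_tokens=60)[0]["generated_text"]  # step 2: rationale

print(label)
print(explanation)
```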

  • Article type: Journal Article
    Reliably detecting potentially misleading patterns in automated diagnostic assistance systems, such as those powered by artificial intelligence (AI), is crucial for instilling user trust and ensuring reliability. Current techniques fall short in visualizing such confounding factors. We propose DiffChest, a self-conditioned diffusion model trained on 515,704 chest radiographs from 194,956 patients across the US and Europe. DiffChest provides patient-specific explanations and visualizes confounding factors that might mislead the model. The high inter-reader agreement, with Fleiss' kappa values of 0.8 or higher, validates its capability to identify treatment-related confounders. Confounders were accurately detected at prevalence rates of 10%-100%. The pretraining process optimizes the model for relevant imaging information, resulting in excellent diagnostic accuracy for 11 chest conditions, including pleural effusion and heart insufficiency. Our findings highlight the potential of diffusion models in medical image classification, providing insights into confounding factors and enhancing model robustness and reliability.

  • Article type: Journal Article
    Crohn's disease (CD) is a chronic inflammatory bowel disease of unknown origin that can cause significant disability and morbidity as it progresses. Due to the unique nature of CD, surgery is often necessary for many patients during their lifetime, and the incidence of postoperative complications is high, which can affect patients' prognosis. Therefore, it is essential to identify and manage postoperative complications. Machine learning (ML) has become increasingly important in the medical field, and ML-based models can be used to predict postoperative complications of intestinal resection for CD. Recently, a valuable article titled "Predicting short-term major postoperative complications in intestinal resection for Crohn's disease: A machine learning-based study" was published by Wang et al. We appreciate the authors' creative work, and we are willing to share our views and discuss them with the authors.

  • Article type: Editorial
    No abstract available.

  • Article type: Journal Article
    Autoencoders are dimension-reduction models in the field of machine learning that can be thought of as a neural network counterpart of principal components analysis (PCA). Due to their flexibility and good performance, autoencoders have recently been used for estimating nonlinear factor models in finance. The main weakness of autoencoders is that their results are less explainable than those obtained with PCA. In this paper, we propose adopting the Shapley value to improve the explainability of autoencoders in the context of nonlinear factor models. In particular, we measure the relevance of nonlinear latent factors using a forecast-based Shapley value approach, which measures each latent factor's contribution to out-of-sample accuracy in factor-augmented models. Considering the interesting empirical case of the commodity market, we identify the most relevant latent factors for each commodity based on their out-of-sample forecasting ability.
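
    The forecast-based Shapley value idea can be sketched as follows, assuming the latent factors have already been extracted: each factor's relevance is its average marginal contribution to out-of-sample forecast accuracy over all factor subsets. Synthetic data and a linear forecaster stand in for the paper's autoencoder factors and commodity returns.

```python
# Illustrative sketch of a forecast-based Shapley value over latent factors.
from itertools import combinations
from math import factorial

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
T, K = 300, 3
factors = rng.normal(size=(T, K))                   # stand-in for autoencoder latent factors
y = 0.8 * factors[:, 0] - 0.5 * factors[:, 2] + 0.1 * rng.normal(size=T)
train, test = slice(0, 200), slice(200, T)

def oos_r2(subset):
    """Out-of-sample R^2 of a factor-augmented forecast using only the given factors."""
    if not subset:
        pred = np.full(T - 200, y[train].mean())    # empty coalition: forecast the mean
    else:
        model = LinearRegression().fit(factors[train][:, subset], y[train])
        pred = model.predict(factors[test][:, subset])
    return 1 - np.sum((y[test] - pred) ** 2) / np.sum((y[test] - y[test].mean()) ** 2)

shapley = np.zeros(K)
for k in range(K):
    others = [j for j in range(K) if j != k]
    for size in range(K):
        for S in combinations(others, size):
            w = factorial(size) * factorial(K - size - 1) / factorial(K)
            shapley[k] += w * (oos_r2(list(S) + [k]) - oos_r2(list(S)))

print(np.round(shapley, 3))                         # higher value = more relevant factor
```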

  • Article type: Journal Article
    The objective of this study was to develop explainable AI models for the prediction of cardiovascular disease. The XGBoost algorithm was used, followed by rule extraction and argumentation theory to provide interpretability, explainability, and accuracy in scenarios with low-confidence results or dilemmas. Our findings are in agreement with previous research utilizing the XGBoost machine learning algorithm for the prediction of cardiovascular risk; however, our approach is additionally supported by rule-based explainability, offering significant advantages in providing both global and local explanations. Further work is needed to enhance the argumentation-based rule interpretability, explainability, and accuracy in scenarios with low-confidence results or dilemmas.
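
    The first stage described here (an XGBoost classifier whose trees are then read back as rules) might look like the sketch below; the synthetic features are stand-ins for cardiovascular risk factors, and the argumentation-theory layer is not reproduced.

```python
# Illustrative sketch: fit an XGBoost classifier and dump its trees as if/then rules.
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = xgb.XGBClassifier(n_estimators=20, max_depth=3, eval_metric="logloss")
model.fit(X, y)

# Each boosted tree can be dumped as a text rule set for downstream rule extraction.
rules = model.get_booster().get_dump(with_stats=False)
print(rules[0])          # split conditions of the first tree, e.g. "0:[f2<0.13] yes=1,no=2 ..."
```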

  • Article type: Journal Article
    In recent years, artificial intelligence (AI) has gained momentum in many fields of daily life. In healthcare, AI can be used for diagnosing or predicting illnesses. However, explainable AI (XAI) is needed to ensure that users understand how the algorithm arrives at a decision. In our research project, machine learning methods are used for individual risk prediction of hospital-onset bacteremia (HOB). This paper presents a vision of a step-wise process for implementing and evaluating user-centered XAI for risk prediction of HOB. An initial requirements analysis revealed first insights into users' explainability needs for using and trusting such risk prediction applications. These findings were then used to propose a step-wise process towards a user-centered evaluation.

  • Article type: Journal Article
    Feature attribution methods are a popular approach for explaining the decisions made by convolutional neural networks. Given their nature as local explainability tools, these methods fall short in providing a systematic evaluation of their global meaningfulness. This limitation often gives rise to confirmation bias, where explanations are crafted after the fact. Consequently, we conducted a systematic investigation of feature attribution methods in the realm of electrocardiogram (ECG) time series, focusing on the R-peak, T-wave, and P-wave. Using a simulated dataset with modifications limited to the R-peak and T-wave, we evaluated the performance of various feature attribution techniques across two CNN architectures and explainability frameworks. Extending our analysis to real-world data revealed that, while feature attribution maps effectively highlight significant regions, they lack clarity even under the simulated ideal conditions, resulting in blurry representations.
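
    For reference, one of the simplest local attribution methods of the kind studied here (plain gradient saliency) can be sketched on a toy 1D CNN over an ECG-like signal; the architecture and input are placeholders, not the CNNs or attribution frameworks evaluated in the paper.

```python
# Illustrative sketch: gradient saliency over a toy 1D CNN for an ECG-like signal.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=7, padding=3), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(8, 2),
)

ecg = torch.randn(1, 1, 500, requires_grad=True)    # placeholder single-lead ECG segment
logits = model(ecg)
logits[0, logits.argmax()].backward()               # gradient of the predicted class score

attribution = ecg.grad.abs().squeeze()              # per-timestep importance scores
print(attribution.shape)                            # torch.Size([500])
```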

  • Article type: Journal Article
    Extensive research has been conducted on time series and tabular data in the context of classification tasks, considering their distinct data domains. While feature extraction enables the transformation of series into tabular data, direct comparisons between these data types remain scarce. Especially in the domain of medical data, such as electrocardiograms (ECGs), deep learning faces challenges due to its lack of easy and fast interpretability and explainability. However, these are crucial aspects for wide and reliable adoption in the field. In our study, we assess the performance of XGBoost on ECG features and of InceptionTime on the raw time series. Our findings reveal that features extracted from ECG signals not only achieve competitive performance but also retain advantages during training and inference. These advantages encompass accuracy, resource efficiency, stability, and a high level of explainability.
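
    The feature-based route compared here (turning each ECG segment into a small, interpretable tabular vector and fitting XGBoost on it) could be sketched as below; the synthetic signals and hand-picked statistics are assumptions, and the paper's feature set and InceptionTime baseline are not reproduced.

```python
# Illustrative sketch: summary-statistic features per ECG segment, classified with XGBoost.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
signals = rng.normal(size=(400, 500))               # stand-in for 400 single-lead ECG segments
labels = rng.integers(0, 2, size=400)               # stand-in for binary diagnoses

def to_features(x):
    """Per-segment summary statistics as a tabular representation."""
    return np.array([x.mean(), x.std(), x.min(), x.max(), np.abs(np.diff(x)).mean()])

X = np.vstack([to_features(s) for s in signals])
clf = xgb.XGBClassifier(n_estimators=50, max_depth=3, eval_metric="logloss").fit(X, labels)
names = ["mean", "std", "min", "max", "mean_abs_diff"]
print(dict(zip(names, clf.feature_importances_.round(3))))   # feature-level explainability
```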
