XAI
  • Article type: Journal Article
    Explainable artificial intelligence (XAI) elucidates the decision-making process of complex AI models and is important in building trust in model predictions. XAI explanations themselves require evaluation for accuracy and reasonableness, in the context of use of the underlying AI model. This review details the evaluation of XAI in cardiac AI applications and found that, of the studies examined, 37% evaluated XAI quality using literature results, 11% used clinicians as domain experts, 11% used proxies or statistical analysis, and the remaining 43% did not assess the XAI used at all. We aim to inspire additional studies within healthcare, urging researchers not only to apply XAI methods but to systematically assess the resulting explanations, as a step towards developing trustworthy and safe models.
    The online version contains supplementary material available at 10.1007/s10462-024-10852-w.

  • Article type: Journal Article
    Metastatic breast cancer (MBC) continues to be a leading cause of cancer-related deaths among women. This work introduces an innovative non-invasive breast cancer classification model designed to improve the identification of cancer metastases. While this study marks an initial exploration into predicting MBC, additional investigations are essential to validate the occurrence of MBC. Our approach combines the strengths of large language models (LLMs), specifically the bidirectional encoder representations from transformers (BERT) model, with the powerful capabilities of graph neural networks (GNNs) to predict MBC patients based on their histopathology reports. This paper introduces a BERT-GNN approach for metastatic breast cancer prediction (BG-MBC) that integrates graph information derived from the BERT model. In this model, nodes are constructed from patient medical records, while BERT embeddings are employed to vectorise the words in histopathology reports, capturing semantic information crucial for classification. Three distinct approaches (univariate selection, an extra-trees classifier for feature importance, and Shapley values) identify the features with the most significant impact. By identifying the 30 most crucial of the 676 features generated as embeddings during model training, the model further enhances its predictive capability. The BG-MBC model achieves outstanding accuracy, with a detection rate of 0.98 and an area under the curve (AUC) of 0.98 in identifying MBC patients. This remarkable performance is credited to the model's use of attention scores generated by the LLM from histopathology reports, effectively capturing pertinent features for classification.
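
    The feature-selection step above combines three techniques. A minimal sketch of the first two (univariate selection and an extra-trees classifier), with stand-in data in place of the study's 676-dimensional BERT embeddings and MBC labels, might look like this; it illustrates the general technique, not the authors' implementation:

```python
# Minimal sketch: reduce 676 BERT-derived embedding features to the 30
# most informative. X and y are stand-ins for the per-patient embedding
# matrix and MBC labels, not the study's actual data.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 676))   # stand-in BERT embeddings
y = rng.integers(0, 2, size=200)  # stand-in metastasis labels

# Approach 1: univariate selection with an ANOVA F-test
selector = SelectKBest(f_classif, k=30).fit(X, y)
top_univariate = selector.get_support(indices=True)

# Approach 2: extra-trees feature importances
forest = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X, y)
top_trees = np.argsort(forest.feature_importances_)[-30:]
```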

  • Article type: Journal Article
    With the outbreak of COVID-19 in 2020, countries worldwide faced significant concerns and challenges. Various studies have emerged utilizing Artificial Intelligence (AI) and Data Science techniques for disease detection. Although COVID-19 cases have declined, there are still cases and deaths around the world. Therefore, early detection of COVID-19 before the onset of symptoms has become crucial in reducing its extensive impact. Fortunately, wearable devices such as smartwatches have proven to be valuable sources of physiological data, including Heart Rate (HR) and sleep quality, enabling the detection of inflammatory diseases. In this study, we utilize an existing dataset that includes individual step counts and heart rate data to predict the probability of COVID-19 infection before the onset of symptoms. We train three main model architectures, the Gradient Boosting classifier (GB), CatBoost trees, and the TabNet classifier, to analyze the physiological data and compare their respective performances. We also add an interpretability layer to our best-performing model, which clarifies prediction results and allows a detailed assessment of effectiveness. Moreover, we created a private dataset by gathering physiological data from Fitbit devices to guarantee reliability and avoid bias. The identical set of models was then applied to this private dataset using the same pre-trained models, and the results were documented. Using the CatBoost tree-based method, our best-performing model outperformed previous studies with an accuracy rate of 85% on the publicly available dataset. Furthermore, this identical pre-trained CatBoost model produced an accuracy of 81% when applied to the private dataset. The source code is available at: https://github.com/OpenUAE-LAB/Covid-19-detection-using-Wearable-data.git.
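
    To make the modelling step concrete, here is a minimal sketch, assuming the `catboost` and `shap` packages and hypothetical feature names, of a CatBoost classifier over wearable-derived features with SHAP as the interpretability layer; the study's actual feature engineering is not specified in the abstract:

```python
# Minimal sketch, not the study's exact pipeline: CatBoost on stand-in
# wearable features, with SHAP values as the interpretability layer.
import numpy as np
import pandas as pd
import shap
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "resting_hr": rng.normal(65, 8, 500),       # hypothetical features
    "hr_over_baseline": rng.normal(0, 2, 500),
    "daily_steps": rng.normal(7000, 2500, 500),
})
y = rng.integers(0, 2, size=500)  # stand-in pre-symptomatic labels

model = CatBoostClassifier(iterations=300, verbose=False).fit(X, y)

# SHAP indicates which physiological signals drive each prediction
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
```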

  • Article type: Journal Article
    BACKGROUND: Responsible artificial intelligence (RAI) emphasizes the use of ethical frameworks implementing accountability, responsibility, and transparency to address concerns in the deployment and use of artificial intelligence (AI) technologies, including privacy, autonomy, self-determination, bias, and transparency. Standards are under development to guide the support and implementation of AI given these considerations.
    OBJECTIVE: The purpose of this review is to provide an overview of current research evidence and knowledge gaps regarding the implementation of RAI principles and the occurrence and resolution of ethical issues within AI systems.
    METHODS: A scoping review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guidelines is proposed. PubMed, ERIC, Scopus, IEEE Xplore, EBSCO, Web of Science, the ACM Digital Library, and ProQuest (Arts and Humanities) will be systematically searched for articles published since 2013 that examine RAI principles and ethical concerns within AI. Eligibility assessment will be conducted independently, and coded data will be analyzed by theme and stratified across discipline-specific literature.
    RESULTS: The results will be included in the full scoping review, which is expected to begin in June 2024, with submission for publication by the end of 2024.
    CONCLUSIONS: This scoping review will summarize the state of evidence and provide an overview of its impact, as well as the strengths, weaknesses, and gaps in research implementing RAI principles. The review may also reveal discipline-specific concerns, priorities, and proposed solutions. It will thereby identify priority areas that should be the focus of future regulatory options, connecting the theoretical aspects of ethical requirements with practical solutions.
    INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): PRR1-10.2196/52349.

  • Article type: Journal Article
    The local interpretable model-agnostic explanation (LIME) method was used to interpret machine learning models of compounds penetrating the blood-brain barrier. The classification models, Random Forest, ExtraTrees, and a Deep Residual Network, were trained and validated using a blood-brain barrier penetration dataset, which records the penetrability of compounds across the blood-brain barrier. LIME was able to create explanations for such penetrability, highlighting the most important substructures of molecules that affect drug penetration across the barrier. The simple and intuitive outputs demonstrate the applicability of this explainable model to interpreting the permeability of compounds across the blood-brain barrier in terms of molecular features. LIME explanations were filtered to those with a weight equal to or greater than 0.1 to obtain only the most relevant explanations. The results showed several structures that are important for blood-brain barrier penetration. In general, it was found that some compounds with nitrogenous substructures are more likely to permeate the blood-brain barrier. The application of these structural explanations may help the pharmaceutical industry and drug synthesis research groups to synthesize active molecules more rationally.
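
    A minimal sketch of the LIME filtering step described above, assuming the `lime` package and binary substructure fingerprints as features (the dataset, model, and feature names are stand-ins):

```python
# Minimal sketch: explain one compound's predicted blood-brain barrier
# penetration with LIME, then keep only explanations whose absolute
# weight is at least 0.1, as the study describes.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(300, 64)).astype(float)  # stand-in fingerprints
y = rng.integers(0, 2, size=300)                      # stand-in BBB labels

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
explainer = LimeTabularExplainer(
    X,
    feature_names=[f"substructure_{i}" for i in range(64)],
    class_names=["non-penetrant", "penetrant"],
)

exp = explainer.explain_instance(X[0], model.predict_proba, num_features=20)
relevant = [(feat, w) for feat, w in exp.as_list() if abs(w) >= 0.1]
```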

  • Article type: Journal Article
    Marker-assisted selection (MAS) plays a crucial role in crop breeding, improving the speed and precision of conventional breeding programmes by quickly and reliably identifying and selecting plants with desired traits. However, the efficacy of MAS depends on several prerequisites, with precise phenotyping being a key aspect of any plant breeding programme. Recent advancements in high-throughput remote phenotyping, facilitated by unmanned aerial vehicles coupled with machine learning, offer a non-destructive and efficient alternative to traditional, time-consuming, and labour-intensive methods. Furthermore, MAS relies on knowledge of marker-trait associations, commonly obtained through genome-wide association studies (GWAS), to understand complex traits such as drought tolerance, including yield components and phenology. However, GWAS has limitations that artificial intelligence (AI) has been shown to partially overcome. Additionally, AI and its explainable variants, which ensure transparency and interpretability, are increasingly being used as recognised problem-solving tools throughout the breeding process. Given these rapid technological advancements, this review provides an overview of state-of-the-art methods and processes underlying each step of MAS, from phenotyping, genotyping, and association analyses to the integration of explainable AI along the entire workflow. In this context, we specifically address the challenges and importance of breeding winter wheat for greater drought tolerance with stable yields, as regional droughts during critical developmental stages pose a threat to winter wheat production. Finally, we explore the transition from scientific progress to practical implementation and discuss ways to bridge the gap between cutting-edge developments and breeders, expediting MAS-based breeding of winter wheat for drought tolerance.

  • Article type: Journal Article
    Seizure detection using machine learning is a critical problem for the timely intervention and management of epilepsy. We propose SeizFt, a robust seizure detection framework using EEG from a wearable device. It uses features paired with an ensemble of trees, thus enabling further interpretation of the model's results. The efficacy of the underlying augmentation and class-balancing strategy is also demonstrated. This study was performed for the Seizure Detection Challenge 2023, an ICASSP Grand Challenge.
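
    The general pattern SeizFt follows, hand-crafted EEG features paired with a tree ensemble, can be sketched as below; the band-power features and random forest are illustrative stand-ins, not SeizFt's actual feature set or augmentation strategy:

```python
# Minimal sketch of features + tree ensemble for seizure detection.
# Windows, labels, and the sampling rate are stand-ins.
import numpy as np
from scipy.signal import welch
from sklearn.ensemble import RandomForestClassifier

fs = 256                                       # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
windows = rng.normal(size=(100, fs * 10))      # stand-in 10 s EEG windows
labels = rng.integers(0, 2, size=100)          # stand-in seizure labels

def band_power(x, lo, hi):
    """Total spectral power of x between lo and hi Hz."""
    f, pxx = welch(x, fs=fs)
    return pxx[(f >= lo) & (f < hi)].sum()

bands = [(0.5, 4), (4, 8), (8, 13), (13, 30)]  # delta, theta, alpha, beta
X = np.array([[band_power(w, lo, hi) for lo, hi in bands] for w in windows])

clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, labels)
```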

  • Article type: Journal Article
    Cancer, a life-threatening disorder caused by genetic abnormalities and metabolic irregularities, is a substantial health danger, with lung and colon cancer being major contributors to death. Histopathological identification is critical in directing effective treatment regimens for these cancers. The earlier these disorders are identified, the lower the risk of death. The use of machine learning and deep learning approaches has the potential to speed up cancer diagnosis by allowing researchers to analyse large patient databases quickly and affordably. This study introduces the Inception-ResNetV2 model with strategically incorporated local binary pattern (LBP) features to improve diagnostic accuracy for lung and colon cancer identification. The model is trained on histopathological images, and the integration of deep learning and texture-based features has demonstrated exceptional performance with 99.98% accuracy. Importantly, the study employs explainable artificial intelligence (AI) through SHapley Additive exPlanations (SHAP) to unravel the complex inner workings of deep learning models, providing transparency in decision-making. This study highlights the potential to revolutionize cancer diagnosis in an era of more accurate and reliable medical assessments.
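
    As an illustration of the texture branch, this minimal sketch extracts a uniform LBP histogram from a stand-in histopathology tile with scikit-image; in the study such descriptors are combined with Inception-ResNetV2 deep features, which are omitted here:

```python
# Minimal sketch: uniform LBP histogram as a texture descriptor.
# The tile is a random stand-in for a histopathology image patch.
import numpy as np
from skimage.feature import local_binary_pattern

rng = np.random.default_rng(0)
tile = rng.integers(0, 256, size=(224, 224)).astype(np.uint8)

P, R = 8, 1  # number of neighbours and radius
lbp = local_binary_pattern(tile, P, R, method="uniform")
hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)

# `hist` can be concatenated with CNN features before the classifier head
print(hist.round(3))
```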

  • Article type: Journal Article
    Stakeholders of machine learning models desire explainable artificial intelligence (XAI) to produce human-understandable and consistent interpretations. In computational toxicity, augmentation of text-based molecular representations has been used successfully for transfer learning on downstream tasks. Augmentations of molecular representations can also be used at inference time to compare differences between multiple representations of the same ground truth. In this study, we investigate the robustness of eight XAI methods using test-time augmentation for a molecular-representation model in the field of computational toxicity prediction. We report significant differences between explanations for different representations of the same ground truth, and show that randomized models have similar variance. We hypothesize that text-based molecular representations in this and past research reflect tokenization more than learned parameters. Furthermore, we see a greater variance between in-domain predictions than out-of-domain predictions, indicating that XAI measures something other than learned parameters. Finally, we investigate the relative importance given to expert-derived structural alerts and find similar importance regardless of applicability domain, randomization, and varying training procedures. We therefore caution future research against validating their methods by comparison to human intuition alone, without further investigation. SCIENTIFIC CONTRIBUTION: In this research we critically investigate XAI through test-time augmentation, contrasting previous assumptions about using expert validation and showing inconsistencies within models for identical representations. SMILES augmentation has been used to increase model accuracy, but was here adapted from the field of image test-time augmentation to be used as an independent indication of the consistency within SMILES-based molecular representation models.
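
    The test-time augmentation at the heart of this study can be sketched with RDKit: generate several randomized SMILES strings for one molecule, then compare the model's explanations across these equivalent representations. The molecule below is a hypothetical example:

```python
# Minimal sketch of SMILES test-time augmentation: many distinct strings,
# one identical ground-truth molecule.
from rdkit import Chem

smiles = "CCOC(=O)c1ccccc1N"  # hypothetical example molecule
mol = Chem.MolFromSmiles(smiles)

# Each randomized SMILES encodes the same molecule
augmented = {Chem.MolToSmiles(mol, doRandom=True) for _ in range(10)}

# A consistency check would run the XAI method on every variant and
# measure agreement, e.g., the variance of per-atom attributions.
for s in sorted(augmented):
    print(s)
```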

  • Article type: Journal Article
    Convolutional Neural Networks (CNNs) are frequently and successfully used in medical prediction tasks. They are often used in combination with transfer learning, leading to improved performance when training data for the task are scarce. The resulting models are highly complex and typically do not provide any insight into their predictive mechanisms, motivating the field of "explainable" artificial intelligence (XAI). However, previous studies have rarely quantitatively evaluated the "explanation performance" of XAI methods against ground-truth data, and the influence of transfer learning on objective measures of explanation performance has not been investigated. Here, we propose a benchmark dataset that allows for quantifying explanation performance in a realistic magnetic resonance imaging (MRI) classification task. We employ this benchmark to understand the influence of transfer learning on the quality of explanations. Experimental results show that popular XAI methods applied to the same underlying model differ vastly in performance, even when considering only correctly classified examples. We further observe that explanation performance strongly depends on the task used for pre-training and the number of CNN layers pre-trained. These results hold after correcting for a substantial correlation between explanation and classification performance.
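
    Quantifying explanation performance against ground truth can be sketched as follows: treat the known signal location as a binary mask and score how well an attribution map ranks those pixels, here with average precision. The mask and saliency map are stand-ins, not the paper's benchmark:

```python
# Minimal sketch: score a saliency map against a ground-truth mask.
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
truth = np.zeros((64, 64), dtype=int)
truth[20:30, 20:30] = 1                        # stand-in lesion location
saliency = rng.random((64, 64)) + 2.0 * truth  # stand-in attribution map

ap = average_precision_score(truth.ravel(), saliency.ravel())
print(f"explanation performance (average precision): {ap:.3f}")
```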