Keywords: Deep learning; Human attention; Object detection; Saliency map; XAI

MeSH: Humans; Artificial Intelligence; Attention / physiology; Neural Networks, Computer

Source: DOI: 10.1016/j.neunet.2024.106392

Abstract:
Explainable artificial intelligence (XAI) has been increasingly investigated to enhance the transparency of black-box artificial intelligence models, promoting better user understanding and trust. Developing an XAI method that is faithful to the model and plausible to users is both a necessity and a challenge. This work examines whether embedding human attention knowledge into saliency-based XAI methods for computer vision models can enhance their plausibility and faithfulness. Two novel XAI methods for object detection models, FullGrad-CAM and FullGrad-CAM++, were first developed to generate object-specific explanations by extending current gradient-based XAI methods for image classification models. Using human attention as the objective plausibility measure, these methods achieve higher explanation plausibility. Interestingly, all current XAI methods, when applied to object detection models, generally produce saliency maps that are less faithful to the model than human attention maps obtained from the same object detection task. Accordingly, human attention-guided XAI (HAG-XAI) was proposed to learn from human attention how best to combine explanatory information from the model: trainable activation functions and smoothing kernels are fitted to maximize the similarity between the XAI saliency map and the human attention map, thereby enhancing explanation plausibility. The proposed XAI methods were evaluated on the widely used BDD-100K, MS-COCO, and ImageNet datasets and compared with typical gradient-based and perturbation-based XAI methods. Results suggest that, for image classification models, HAG-XAI enhanced explanation plausibility and user trust at the expense of faithfulness, whereas for object detection models it enhanced plausibility, faithfulness, and user trust simultaneously and outperformed existing state-of-the-art XAI methods.
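The abstract describes HAG-XAI only at a high level (trainable activation functions and smoothing kernels fitted to maximize similarity to human attention maps), so the following is a minimal Python/PyTorch sketch of that idea, not the authors' implementation. The class name, the parametric form of the learnable activation, the Gaussian-free learnable kernel, and the Pearson-correlation similarity loss are all illustrative assumptions; the paper's actual layer choices and similarity metric may differ.

```python
# Illustrative sketch of the HAG-XAI idea from the abstract (assumed details,
# not the published implementation): combine gradient-weighted activations
# through a trainable activation function and a trainable smoothing kernel,
# then fit those parameters to maximize similarity with human attention maps.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HAGSaliencyHead(nn.Module):
    """Maps per-layer activations and gradients to a single saliency map."""

    def __init__(self, kernel_size: int = 15):
        super().__init__()
        # Trainable pointwise "activation function": softplus with learnable
        # slope and shift (an assumed parametric form, for illustration only).
        self.slope = nn.Parameter(torch.tensor(1.0))
        self.shift = nn.Parameter(torch.tensor(0.0))
        # Trainable smoothing kernel; softmax keeps it non-negative and normalized.
        self.kernel_logits = nn.Parameter(torch.zeros(kernel_size, kernel_size))

    def forward(self, feature_maps: torch.Tensor, gradients: torch.Tensor) -> torch.Tensor:
        # Gradient-weighted activations, as in Grad-CAM-style explanations.
        weighted = feature_maps * gradients              # (B, C, H, W)
        raw = weighted.sum(dim=1, keepdim=True)          # (B, 1, H, W)
        # Learnable nonlinearity instead of a fixed ReLU.
        act = F.softplus(self.slope * raw + self.shift)
        # Learnable smoothing kernel applied as a single-channel convolution.
        k = torch.softmax(self.kernel_logits.flatten(), dim=0)
        k = k.view(1, 1, *self.kernel_logits.shape)
        sal = F.conv2d(act, k, padding=self.kernel_logits.shape[-1] // 2)
        # Normalize each map to [0, 1] for comparison with human attention maps.
        sal = sal - sal.amin(dim=(2, 3), keepdim=True)
        return sal / (sal.amax(dim=(2, 3), keepdim=True) + 1e-8)


def plausibility_loss(saliency: torch.Tensor, human_attention: torch.Tensor) -> torch.Tensor:
    """Negative Pearson correlation between saliency and human attention maps
    (one common plausibility metric; the paper may use a different similarity)."""
    s = saliency.flatten(1) - saliency.flatten(1).mean(dim=1, keepdim=True)
    h = human_attention.flatten(1) - human_attention.flatten(1).mean(dim=1, keepdim=True)
    corr = (s * h).sum(dim=1) / (s.norm(dim=1) * h.norm(dim=1) + 1e-8)
    return -corr.mean()
```

In such a setup, only the head's parameters (slope, shift, kernel_logits) would be optimized against recorded human attention maps while the underlying detector or classifier stays frozen, matching the abstract's description of learning how to combine the model's own explanatory information rather than changing the model itself.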