XAI

  • Article type: Journal Article
    Metastatic breast cancer (MBC) continues to be a leading cause of cancer-related deaths among women. This work introduces an innovative non-invasive breast cancer classification model designed to improve the identification of cancer metastases. While this study marks an initial exploration into predicting MBC, additional investigations are essential to validate the occurrence of MBC. Our approach combines the strengths of large language models (LLMs), specifically the bidirectional encoder representations from transformers (BERT) model, with the powerful capabilities of graph neural networks (GNNs) to predict MBC patients based on their histopathology reports. This paper introduces a BERT-GNN approach for metastatic breast cancer prediction (BG-MBC) that integrates graph information derived from the BERT model. In this model, nodes are constructed from patient medical records, while BERT embeddings are employed to vectorise the words in histopathology reports, capturing semantic information crucial for classification. Three distinct approaches (namely univariate selection, an extra-trees classifier for feature importance, and Shapley values) are then employed to identify the features with the most significant impact. By identifying the 30 most crucial of the 676 features generated as embeddings during model training, the model further enhances its predictive capability. The BG-MBC model achieves outstanding accuracy, with a detection rate of 0.98 and an area under the curve (AUC) of 0.98, in identifying MBC patients. This remarkable performance is credited to the model's utilisation of attention scores generated by the LLM from histopathology reports, effectively capturing pertinent features for classification.
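    One of the three selection methods named above, the extra-trees importance ranking, can be sketched as below. The data here is synthetic; only the dimensions (676 embedding features, top 30 kept) follow the abstract, and all other names and values are illustrative assumptions.

```python
# Sketch: ranking embedding dimensions by extra-trees feature importance
# and keeping the top 30 of 676, as described in the abstract.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 676))           # stand-in for BERT-derived embeddings
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # stand-in labels (MBC vs. non-MBC)

forest = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, y)
top30 = np.argsort(forest.feature_importances_)[::-1][:30]
X_reduced = X[:, top30]                   # reduced input for a downstream model
print(X_reduced.shape)                    # (200, 30)
```

    In the paper this ranking would be cross-checked against univariate selection and Shapley values before fixing the final 30-feature set.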
  • Article type: Journal Article
    With the outbreak of COVID-19 in 2020, countries worldwide faced significant concerns and challenges. Various studies have emerged utilizing Artificial Intelligence (AI) and Data Science techniques for disease detection. Although COVID-19 cases have declined, there are still cases and deaths around the world. Therefore, early detection of COVID-19 before the onset of symptoms has become crucial in reducing its extensive impact. Fortunately, wearable devices such as smartwatches have proven to be valuable sources of physiological data, including Heart Rate (HR) and sleep quality, enabling the detection of inflammatory diseases. In this study, we utilize an already-existing dataset that includes individual step counts and heart rate data to predict the probability of COVID-19 infection before the onset of symptoms. We train three main model architectures: the Gradient Boosting classifier (GB), CatBoost trees, and the TabNet classifier to analyze the physiological data and compare their respective performances. We also add an interpretability layer to our best-performing model, which clarifies prediction results and allows a detailed assessment of effectiveness. Moreover, we created a private dataset by gathering physiological data from Fitbit devices to guarantee reliability and avoid bias. The identical set of models was then applied to this private dataset using the same pre-trained models, and the results were documented. Using the CatBoost tree-based method, our best-performing model outperformed previous studies with an accuracy rate of 85% on the publicly available dataset. Furthermore, this identical pre-trained CatBoost model produced an accuracy of 81% when applied to the private dataset. The source code is available at: https://github.com/OpenUAE-LAB/Covid-19-detection-using-Wearable-data.git
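    A minimal sketch of the gradient-boosting baseline named in the abstract, trained on synthetic step-count and resting-heart-rate features. The feature construction and labels below are illustrative stand-ins, not the study's actual wearable dataset.

```python
# Sketch: gradient boosting on toy wearable-style features (steps, resting HR).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 400
steps = rng.normal(8000, 2000, n)   # daily step count (synthetic)
rhr = rng.normal(65, 5, n)          # resting heart rate (synthetic)
y = (rhr > 68).astype(int)          # toy "elevated HR" label for illustration
X = np.column_stack([steps, rhr])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
clf = GradientBoostingClassifier(random_state=1).fit(X_tr, y_tr)
print(round(clf.score(X_te, y_te), 2))
```

    The study's interpretability layer would then be applied on top of the fitted model to explain individual predictions.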
  • Article type: Journal Article
    BACKGROUND: Responsible artificial intelligence (RAI) emphasizes the use of ethical frameworks implementing accountability, responsibility, and transparency to address concerns in the deployment and use of artificial intelligence (AI) technologies, including privacy, autonomy, self-determination, bias, and transparency. Standards are under development to guide the support and implementation of AI given these considerations.
    OBJECTIVE: The purpose of this review is to provide an overview of current research evidence and knowledge gaps regarding the implementation of RAI principles and the occurrence and resolution of ethical issues within AI systems.
    METHODS: A scoping review following Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guidelines was proposed. PubMed, ERIC, Scopus, IEEE Xplore, EBSCO, Web of Science, ACM Digital Library, and ProQuest (Arts and Humanities) will be systematically searched for articles published since 2013 that examine RAI principles and ethical concerns within AI. Eligibility assessment will be conducted independently and coded data will be analyzed along themes and stratified across discipline-specific literature.
    RESULTS: The results will be included in the full scoping review, which is expected to start in June 2024 and to be completed for submission for publication by the end of 2024.
    CONCLUSIONS: This scoping review will summarize the state of evidence and provide an overview of its impact, as well as the strengths, weaknesses, and gaps in research implementing RAI principles. The review may also reveal discipline-specific concerns, priorities, and proposed solutions to these concerns. It will thereby identify priority areas that should be the focus of available future regulatory options, connecting the theoretical aspects of ethical requirements with practical solutions.
    INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): PRR1-10.2196/52349.
  • Article type: Journal Article
    Explainable artificial intelligence (XAI) has been increasingly investigated to enhance the transparency of black-box artificial intelligence models, promoting better user understanding and trust. Developing an XAI that is faithful to models and plausible to users is both a necessity and a challenge. This work examines whether embedding human attention knowledge into saliency-based XAI methods for computer vision models could enhance their plausibility and faithfulness. Two novel XAI methods for object detection models, namely FullGrad-CAM and FullGrad-CAM++, were first developed to generate object-specific explanations by extending the current gradient-based XAI methods for image classification models. Using human attention as the objective plausibility measure, these methods achieve higher explanation plausibility. Interestingly, all current XAI methods when applied to object detection models generally produce saliency maps that are less faithful to the model than human attention maps from the same object detection task. Accordingly, human attention-guided XAI (HAG-XAI) was proposed to learn from human attention how to best combine explanatory information from the models to enhance explanation plausibility by using trainable activation functions and smoothing kernels to maximize the similarity between XAI saliency map and human attention map. The proposed XAI methods were evaluated on widely used BDD-100K, MS-COCO, and ImageNet datasets and compared with typical gradient-based and perturbation-based XAI methods. Results suggest that HAG-XAI enhanced explanation plausibility and user trust at the expense of faithfulness for image classification models, and it enhanced plausibility, faithfulness, and user trust simultaneously and outperformed existing state-of-the-art XAI methods for object detection models.
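    The abstract's plausibility measure compares an XAI saliency map against a human attention map. A simple sketch of such a comparison is below; the use of Pearson correlation as the similarity score and the random maps are illustrative assumptions, not the paper's exact metric.

```python
# Sketch: scoring explanation plausibility as the similarity between a
# saliency map and a human attention map of the same shape.
import numpy as np

def plausibility(saliency: np.ndarray, attention: np.ndarray) -> float:
    """Pearson correlation between two flattened, same-shaped maps."""
    s, a = saliency.ravel(), attention.ravel()
    return float(np.corrcoef(s, a)[0, 1])

rng = np.random.default_rng(2)
human = rng.random((32, 32))                  # stand-in human attention map
aligned = human + 0.1 * rng.random((32, 32))  # saliency close to human attention
print(plausibility(aligned, human) > 0.9)     # True
```

    HAG-XAI goes further: rather than only scoring similarity post hoc, it trains activation functions and smoothing kernels to maximize a similarity of this kind.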
  • Article type: Editorial
    No abstract available.
  • Article type: Journal Article
    The local interpretable model-agnostic explanation (LIME) method was used to interpret two machine learning models of compounds penetrating the blood-brain barrier. The classification models, Random Forest, ExtraTrees, and Deep Residual Network, were trained and validated using the blood-brain barrier penetration dataset, which shows the penetrability of compounds in the blood-brain barrier. LIME was able to create explanations for such penetrability, highlighting the most important substructures of molecules that affect drug penetration in the barrier. The simple and intuitive outputs prove the applicability of this explainable model to interpreting the permeability of compounds across the blood-brain barrier in terms of molecular features. LIME explanations were filtered with a weight equal to or greater than 0.1 to obtain only the most relevant explanations. The results showed several structures that are important for blood-brain barrier penetration. In general, it was found that some compounds with nitrogenous substructures are more likely to permeate the blood-brain barrier. The application of these structural explanations may help the pharmaceutical industry and potential drug synthesis research groups to synthesize active molecules more rationally.
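    The filtering step described above (keeping only explanation weights of at least 0.1) can be sketched as a one-liner over a LIME-style weight mapping. The substructure names and weights below are made up for illustration, and absolute value is assumed for the threshold.

```python
# Sketch: keeping only LIME feature weights with |weight| >= 0.1.
weights = {
    "aromatic_N": 0.24,   # nitrogenous substructure (illustrative)
    "carboxyl": -0.15,
    "methyl": 0.03,       # below threshold, dropped
    "hydroxyl": -0.08,    # below threshold, dropped
}

relevant = {k: w for k, w in weights.items() if abs(w) >= 0.1}
print(sorted(relevant))   # ['aromatic_N', 'carboxyl']
```

    In the study, the surviving weights correspond to molecular substructures, which is what makes the filtered output directly readable by chemists.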
  • Article type: Journal Article
    Marker-assisted selection (MAS) plays a crucial role in crop breeding improving the speed and precision of conventional breeding programmes by quickly and reliably identifying and selecting plants with desired traits. However, the efficacy of MAS depends on several prerequisites, with precise phenotyping being a key aspect of any plant breeding programme. Recent advancements in high-throughput remote phenotyping, facilitated by unmanned aerial vehicles coupled to machine learning, offer a non-destructive and efficient alternative to traditional, time-consuming, and labour-intensive methods. Furthermore, MAS relies on knowledge of marker-trait associations, commonly obtained through genome-wide association studies (GWAS), to understand complex traits such as drought tolerance, including yield components and phenology. However, GWAS has limitations that artificial intelligence (AI) has been shown to partially overcome. Additionally, AI and its explainable variants, which ensure transparency and interpretability, are increasingly being used as recognised problem-solving tools throughout the breeding process. Given these rapid technological advancements, this review provides an overview of state-of-the-art methods and processes underlying each MAS, from phenotyping, genotyping and association analyses to the integration of explainable AI along the entire workflow. In this context, we specifically address the challenges and importance of breeding winter wheat for greater drought tolerance with stable yields, as regional droughts during critical developmental stages pose a threat to winter wheat production. Finally, we explore the transition from scientific progress to practical implementation and discuss ways to bridge the gap between cutting-edge developments and breeders, expediting MAS-based winter wheat breeding for drought tolerance.
  • Article type: Journal Article
    Seizure detection using machine learning is a critical problem for the timely intervention and management of epilepsy. We propose SeizFt, a robust seizure detection framework using EEG from a wearable device. It uses features paired with an ensemble of trees, thus enabling further interpretation of the model's results. The efficacy of the underlying augmentation and class-balancing strategy is also demonstrated. This study was performed for the Seizure Detection Challenge 2023, an ICASSP Grand Challenge.
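    The "features paired with an ensemble of trees" design can be sketched as below on synthetic single-channel EEG windows. The two features (line length and variance) and the seizure simulation are illustrative assumptions, not the SeizFt feature set.

```python
# Sketch: hand-crafted EEG window features fed to a tree ensemble.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)

def window_features(x: np.ndarray) -> list:
    # Line length (sum of absolute sample-to-sample differences) and variance.
    return [np.abs(np.diff(x)).sum(), x.var()]

normal = [window_features(rng.normal(0, 1, 256)) for _ in range(100)]
seizure = [window_features(rng.normal(0, 3, 256)) for _ in range(100)]  # higher amplitude
X = np.array(normal + seizure)
y = np.array([0] * 100 + [1] * 100)

clf = RandomForestClassifier(n_estimators=50, random_state=3).fit(X, y)
print(clf.score(X, y))   # training accuracy on this toy data
```

    Because each tree splits on named features rather than raw samples, the ensemble's importances remain interpretable, which is the property the abstract highlights.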
  • Article type: Journal Article
    Cancer, a life-threatening disorder caused by genetic abnormalities and metabolic irregularities, poses a substantial health hazard, with lung and colon cancer being major contributors to mortality. Histopathological identification is critical in directing effective treatment regimens for these cancers. The earlier these disorders are identified, the lower the risk of death. The use of machine learning and deep learning approaches has the potential to speed up cancer diagnosis by allowing researchers to analyse large patient databases quickly and affordably. This study introduces the Inception-ResNetV2 model with strategically incorporated local binary pattern (LBP) features to improve diagnostic accuracy for lung and colon cancer identification. The model is trained on histopathological images, and the integration of deep learning and texture-based features has demonstrated exceptional performance with 99.98% accuracy. Importantly, the study employs explainable artificial intelligence (AI) through SHapley Additive exPlanations (SHAP) to unravel the complex inner workings of deep learning models, providing transparency in decision-making processes. This study highlights the potential to revolutionize cancer diagnosis in an era of more accurate and reliable medical assessments.
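    The texture side of this pipeline, the local binary pattern, can be sketched directly in NumPy: each interior pixel gets an 8-bit code recording which of its 8 neighbours are at least as bright as it. The bit ordering below is one common convention, chosen for illustration; the paper's fusion of these codes with Inception-ResNetV2 deep features is not shown.

```python
# Sketch: basic 8-neighbour local binary pattern (LBP) codes per pixel.
import numpy as np

def lbp(img: np.ndarray) -> np.ndarray:
    """8-neighbour LBP codes for the interior pixels of a 2-D image."""
    c = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the same interior window, one neighbour direction.
        neigh = img[1 + dy: img.shape[0] - 1 + dy,
                    1 + dx: img.shape[1] - 1 + dx]
        code |= (neigh >= c).astype(np.uint8) << bit
    return code

patch = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)
print(lbp(patch))   # [[120]]: bits 3-6 set (the four neighbours >= 5)
```

    Histograms of such codes over image tiles are what typically serve as the texture feature vector alongside the CNN embedding.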
  • Article type: Journal Article
    In the rapidly evolving field of artificial intelligence (AI), explainability has traditionally been assessed in a post-modeling process and is often subjective. In contrast, many quantitative metrics have been routinely used to assess a model's performance. We proposed a unified formula named PERForm, which incorporates explainability as a weight into existing statistical metrics to provide an integrated and quantitative measure of both predictivity and explainability to guide model selection, application, and evaluation. PERForm was designed as a generic formula and can be applied to any data type. We applied PERForm to a range of diverse datasets, including DILIst, Tox21, and three MAQC-II benchmark datasets, using various modeling algorithms to predict a total of 73 distinct endpoints. For example, the AdaBoost algorithm exhibited superior performance in DILIst prediction (PERForm AUC of 0.129 for AdaBoost versus 0 for linear regression), whereas linear regression outperformed other models in the majority of Tox21 endpoints (PERForm AUC of 0.301 for linear regression versus 0.283 for AdaBoost on average). This research marks a significant step toward comprehensively evaluating the utility of an AI model to advance transparency and interpretability, where the tradeoff between a model's performance and its interpretability can have profound implications.
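    The abstract describes PERForm only as folding an explainability weight into an existing statistical metric; the exact published formula is not given here. The sketch below is therefore a guessed illustrative shape (a performance metric scaled by an explainability weight in [0, 1]), not the actual PERForm definition, but it shows how such a weighting can reorder model rankings.

```python
# Illustrative (NOT the published PERForm formula): a performance metric
# scaled by an explainability weight between 0 and 1.
def weighted_score(metric: float, explainability_weight: float) -> float:
    """Hypothetical explainability-weighted score."""
    return metric * explainability_weight

# A highly explainable linear model can out-rank a slightly more accurate
# but opaque ensemble under such a weighting:
linear = weighted_score(0.82, 0.9)    # 0.738
ensemble = weighted_score(0.88, 0.5)  # 0.44
print(linear > ensemble)              # True
```

    This is the kind of tradeoff the abstract refers to: once explainability enters the score, the best-performing model need not be the best-ranked one.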