knowledge base

  • Article type: Journal Article
    With the rapid development of deep learning techniques, their applications have become increasingly widespread across many domains. However, traditional deep learning methods are often described as "black box" models whose results are hard to interpret, which poses challenges for their use in certain critical domains. In this study, we propose a comprehensive method for the interpretability analysis of sentiment models. The proposed method has two main aspects: attention-based analysis and external knowledge integration. First, we train the model on sentiment classification and generation tasks to capture attention scores from multiple perspectives. This multi-angle approach reduces bias and provides a more comprehensive understanding of the underlying sentiment. Second, we incorporate an external knowledge base to improve evidence extraction. By leveraging character scores, we retrieve complete sentiment evidence phrases, addressing the challenge of incomplete evidence extraction in Chinese texts. Experimental results on a sentiment interpretability evaluation dataset demonstrate the effectiveness of our method: we observe increases of 1.3% in accuracy, 13% in Macro-F1, and 23% in MAP. Overall, our approach offers a robust solution for enhancing the interpretability of sentiment models by combining attention-based analysis with the integration of external knowledge.
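    The abstract reports Macro-F1 and MAP without defining them. As a rough illustration only, the sketch below uses the common textbook definitions (unweighted mean of per-class F1, and mean of per-query average precision); the evaluation dataset may define them differently, and the function names and toy data are assumptions made for this example.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (textbook definition)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def average_precision(ranked_items, relevant):
    """Average of precision@k over the ranks k where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of average precision over (ranked_items, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: sentiment labels and one ranked list of evidence tokens.
labels_true = ["pos", "pos", "neg", "neg"]
labels_pred = ["pos", "neg", "neg", "neg"]
evidence_runs = [(["a", "b", "c"], {"a", "c"})]

print(macro_f1(labels_true, labels_pred))
print(mean_average_precision(evidence_runs))
```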

  • Article type: Journal Article
    Standardized nomenclature for genes, gene products, and isoforms is crucial to prevent ambiguity and enable clear communication of scientific data, facilitating efficient biocuration and data sharing. Standardized genotype nomenclature, which describes alleles present in a specific strain that differ from those in the wild-type reference strain, is equally essential to maximize research impact and ensure that results linking genotypes to phenotypes are Findable, Accessible, Interoperable, and Reusable (FAIR). In this publication, we extend the fission yeast clade gene nomenclature guidelines to support the curation efforts at PomBase (www.pombase.org), the Schizosaccharomyces pombe Model Organism Database. This update introduces nomenclature guidelines for noncoding RNA genes, following those set forth by the Human Genome Organisation Gene Nomenclature Committee. Additionally, we provide a significant update to the allele and genotype nomenclature guidelines originally published in 1987, to standardize the diverse range of genetic modifications enabled by the fission yeast genetic toolbox. These updated guidelines reflect a community consensus among numerous fission yeast researchers. Adoption of these rules will improve consistency in gene and genotype nomenclature, and facilitate machine-readability and automated entity recognition of fission yeast genes and alleles in publications and datasets. In conclusion, our updated guidelines provide a valuable resource for the fission yeast research community, promoting consistency, clarity, and FAIRness in genetic data sharing and interpretation.

  • Article type: Journal Article
    The choice of treatment and the evaluation of prognosis depend on the accurate early diagnosis of brain tumors. Many brain tumors go undiagnosed or are overlooked by clinicians because of the challenges of manually evaluating magnetic resonance imaging (MRI) images in clinical practice. In this study, we built a computer-aided diagnosis (CAD) system for glioma detection, grading, segmentation, and knowledge discovery based on artificial intelligence algorithms. Neuroimages are represented using a type of visual feature known as the histogram of gradients (HOG). Then, through a two-level classification framework, the HOG features are used to distinguish between healthy controls and patients, or between different glioma grades. The CAD system also offers tumor visualization through a semi-automatic segmentation tool, for better patient management and treatment monitoring. Finally, a knowledge base is created to offer additional advice for the diagnosis of brain tumors. Based on our proposed two-level classification framework, we train models for glioma detection and grading, achieving areas under the curve (AUC) of 0.921 and 0.806, respectively. Unlike other systems, we integrate these diagnostic tools with a web-based interface, which provides flexibility for system deployment.
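    To make the HOG feature mentioned above more concrete, here is a minimal pure-Python sketch of a magnitude-weighted orientation histogram for a single image cell. It is not the system's implementation: it assumes central-difference gradients, unsigned orientations in [0, 180) and 9 bins, and it omits the bin interpolation and block normalization that full HOG descriptors usually include. The function name and toy cell are hypothetical.

```python
import math


def hog_cell_histogram(cell, n_bins=9):
    """Histogram of unsigned gradient orientations for one cell.

    `cell` is a 2D list of pixel intensities. Each interior pixel votes
    for the bin of its gradient orientation, weighted by gradient magnitude.
    """
    height, width = len(cell), len(cell[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned orientation
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# Hypothetical 5x5 cell whose intensity increases from left to right,
# so every gradient points along the horizontal axis.
ramp = [[float(x) for x in range(5)] for _ in range(5)]
print(hog_cell_histogram(ramp))
```

    In a pipeline like the one described, histograms from many cells would be concatenated into one feature vector and passed to the classifiers; that step is left out here.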

  • Article type: Journal Article
    In clinics, a radiology report is crucial for guiding a patient's treatment. However, writing radiology reports is a heavy burden for radiologists. To this end, we present an automatic, multi-modal approach for report generation from a chest x-ray. Our approach, motivated by the observation that the descriptions in radiology reports are highly correlated with specific information in the x-ray images, features two distinct modules: (i) Learned knowledge base: to absorb the knowledge embedded in the radiology reports, we build a knowledge base that can automatically distill and restore medical knowledge from textual embeddings without manual labor; (ii) Multi-modal alignment: to promote the semantic alignment among reports, disease labels, and images, we explicitly use textual embeddings to guide the learning of the visual feature space. We evaluate the performance of the proposed model using metrics from both natural language generation and clinical efficacy on the public IU-Xray and MIMIC-CXR datasets. Our ablation study shows that each module contributes to improving the quality of generated reports. Furthermore, with the assistance of both modules, our approach outperforms state-of-the-art methods on almost all the metrics. Code is available at https://github.com/LX-doctorAI1/M2KT.

  • Article type: Journal Article
    Safety issues have always been of great concern to the metro construction industry. Numerous studies have shown that safety issues are closely related to the design phase, and many safety problems can be solved or mitigated by improving the design. This study proposes a structured identification method for safety risks based on metro design specifications, journal literature, and expert experience. A safety knowledge base (KB) for design was established to enable the sharing and reuse of safety knowledge. The KB has been integrated into Building Information Modeling (BIM) software as an inspection plug-in to achieve automated analysis and retrieval of safety risks. Designers are provided with a visualization of risk components so that they can locate and improve the pre-control measures of the design. Subsequently, the process of creating the design for safety (DFS) database was demonstrated with a metro station project, and the feasibility of applying the KB to safety checking in BIM was verified. In response to the inspection results, safety risks in the construction phases can be eliminated or avoided by standardizing and improving the design.

  • Article type: Journal Article
    Colorectal cancer (CRC) is a heterogeneous disease with different responses to targeted therapies due to various factors, and the treatment effect differs significantly between individuals. Personalized medical treatment (PMT) is a method that takes individual patient characteristics into consideration, making it the most effective way to deal with this issue. Patient similarity and clustering analysis are important aspects of PMT. This paper describes how to build a knowledge base using formal concept analysis (FCA), which clusters patients based on their similarity and preserves the relations between clusters in a hierarchical structure.
    Prognostic factors (attributes) of 2442 CRC patients, including patient age, cancer cell differentiation, lymphatic invasion, and metastasis stage, were used to build a formal context in FCA. A concept was defined as a set of patients together with their shared attributes. The formal context was formed based on the similarity scores between the concepts identified from the dataset, and can be used as a knowledge base.
    A hierarchical knowledge base was constructed along with the clinical records of the diagnosed CRC patients. For each new patient, a similarity score to each existing concept in the knowledge base can be retrieved using different similarity calculations. The ranked similarity scores associated with the concepts can offer references for treatment plans.
    Patients who share the same concept indicate potentially similar effects from the same clinical procedures or treatments. In conjunction with a clinician's ability to undertake flexible analyses and apply appropriate judgement, the knowledge base allows faster and more effective decisions to be made for patient treatment and care.
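    The FCA workflow described above can be sketched on a toy example. The patients, attribute names, and the choice of Jaccard similarity below are assumptions made for illustration; the abstract does not state which similarity calculations the authors used, and the brute-force enumeration here is only practical for very small contexts.

```python
from itertools import combinations

# Hypothetical toy formal context: patient -> set of binary prognostic attributes.
CONTEXT = {
    "p1": {"age>=65", "poorly_differentiated", "lymphatic_invasion"},
    "p2": {"age>=65", "lymphatic_invasion"},
    "p3": {"poorly_differentiated", "metastasis"},
    "p4": {"age>=65", "poorly_differentiated", "lymphatic_invasion", "metastasis"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def extent(attrs):
    """Patients that have every attribute in `attrs`."""
    return frozenset(p for p, owned in CONTEXT.items() if attrs <= owned)


def intent(patients):
    """Attributes shared by every patient in `patients`."""
    shared = set(ATTRIBUTES)
    for patient in patients:
        shared &= CONTEXT[patient]
    return frozenset(shared)


def formal_concepts():
    """All (extent, intent) pairs, found by closing every attribute subset.

    Exponential in the number of attributes; fine for a toy context only.
    """
    concepts = set()
    for size in range(len(ATTRIBUTES) + 1):
        for combo in combinations(sorted(ATTRIBUTES), size):
            ext = extent(frozenset(combo))
            concepts.add((ext, intent(ext)))
    return concepts


def jaccard(a, b):
    """Jaccard similarity of two sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def rank_concepts(new_patient_attrs):
    """Concepts ranked by similarity between the new patient and each intent."""
    new_attrs = frozenset(new_patient_attrs)
    scored = [
        (jaccard(new_attrs, concept_intent), sorted(concept_intent), sorted(concept_extent))
        for concept_extent, concept_intent in formal_concepts()
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


for score, attrs, patients in rank_concepts({"age>=65", "lymphatic_invasion", "metastasis"}):
    print(f"{score:.2f}  intent={attrs}  extent={patients}")
```

    The subset relation between extents gives the hierarchy mentioned in the abstract: a concept with a larger intent sits below concepts whose intents it contains.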

  • Article type: Journal Article
    To address the problems of existing distributed denial of service (DDoS) attack detection, namely single detection targets, incomplete detection datasets, and privacy concerns caused by shared datasets, we propose a trusted multi-domain DDoS detection method based on federated learning. First, we divide DDoS attacks into different sub-attacks, design a federated learning dataset for DDoS detection in each domain, and use these datasets to build a more comprehensive DDoS detection method while protecting the data privacy of each domain. Second, to improve the robustness of federated learning and mitigate poisoning attacks, we propose a blockchain-based reputation evaluation method that comprehensively estimates the interaction reputation, data reputation, and resource reputation of each participant, so as to select trusted federated learning participants and identify malicious ones. In addition, we propose a scheme that combines multi-domain detection with a distributed knowledge base, and design a feature graph of malicious behavior based on a knowledge graph to retain multi-domain feature knowledge. The experimental results show that the multi-domain DDoS detection method reaches an accuracy above 95% for most categories while the datasets remain protected, and that the proposed reputation evaluation method identifies malicious participants mounting data poisoning attacks more effectively when the threshold is set to 0.6.
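    One way to picture the reputation evaluation step is as a combined score per participant that is compared against the 0.6 threshold. The abstract does not give the combination formula or say exactly what the threshold applies to, so the weighted sum, the weights, and all names below are assumptions for illustration; the blockchain layer is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Reputation:
    """Per-participant reputation components, each assumed to lie in [0, 1]."""
    interaction: float
    data: float
    resource: float


# Hypothetical weights; the paper's actual combination rule is not given in the abstract.
WEIGHTS = (0.3, 0.5, 0.2)
THRESHOLD = 0.6  # threshold value quoted in the abstract, applied here to the combined score


def combined_score(rep, weights=WEIGHTS):
    """Weighted sum of interaction, data, and resource reputation."""
    w_interaction, w_data, w_resource = weights
    return (w_interaction * rep.interaction
            + w_data * rep.data
            + w_resource * rep.resource)


def split_participants(participants, threshold=THRESHOLD):
    """Return (trusted, suspected) participant names based on the combined score."""
    trusted = {name for name, rep in participants.items()
               if combined_score(rep) >= threshold}
    return trusted, set(participants) - trusted


# Hypothetical participants from two domains.
participants = {
    "domain_a": Reputation(interaction=0.9, data=0.8, resource=0.7),
    "domain_b": Reputation(interaction=0.5, data=0.2, resource=0.9),
}
print(split_participants(participants))
```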

  • Article type: Journal Article
    In the field of neuroscience, the core of a cohort study project consists of the collection, analysis, and sharing of multi-modal data. Recent years have witnessed a host of efficient and high-quality toolkits published and employed to improve the quality of multi-modal data in cohort studies. In turn, gleaning answers to relevant questions from such a conglomeration of studies is a time-consuming task for cohort researchers. As part of our efforts to tackle this problem, we propose a hierarchical neuroscience knowledge base that consists of projects/organizations, multi-modal databases, and toolkits, so as to facilitate researchers' search for answers. We first classified studies conducted for the topic "Frontiers in Neuroinformatics" according to the multi-modal data life cycle, and from these studies extracted information objects such as projects/organizations, multi-modal databases, and toolkits. Then, we mapped these information objects into our proposed knowledge base framework. A Python-based query tool has also been developed in tandem for quicker access to the knowledge base (accessible at https://github.com/Romantic-Pumpkin/PDT_fninf). Finally, based on the constructed knowledge base, we discuss some key research issues and underlying trends in different stages of the multi-modal data life cycle.

  • Article type: Journal Article
    Neurodegenerative diseases (NDDs) are a series of chronic diseases associated with progressive loss of neuronal structure or function. The complex etiologies of NDDs remain unclear, so the prevention and early diagnosis of NDDs are critical to reducing the mortality and morbidity of these diseases.
    To provide a systematic understanding of the heterogeneity of the risk factors associated with different NDDs (pan-neurodegenerative diseases or pan-NDDs), a knowledge base was established to facilitate the personalized and knowledge-guided diagnosis, prevention, and prediction of NDDs.
    Before data collection, medical, life science, and informatics experts, as well as potential users of the database, were consulted to discuss the scope of the data and the classification of risk factors. The PubMed database was used as the source for data and knowledge extraction. Risk factors of NDDs were manually collected from literature published between 1975 and 2020.
    The comprehensive risk factors database for NDDs (NDDRF) was established, including 998 single or combined risk factors, 2293 records, and 1071 articles relevant to the 14 most common NDDs. The single risk factors are classified into 3 categories, i.e. epidemiological factors (469), genetic factors (324), and biochemical factors (153). Among all the factors, 179 are positive and protective, while 880 have a negative influence on NDDs. The knowledge base is available at http://sysbio.org.cn/NDDRF/.
    NDDRF provides structured information and a knowledge resource on risk factors of NDDs. It could benefit future systematic and personalized investigation of pan-NDD genesis and progression. It may also be used in future explainable artificial intelligence modeling for smart diagnosis and prevention of NDDs.

  • Article type: Journal Article
    Updated and expert-quality knowledge bases are fundamental to biomedical research. A knowledge base established with human participation and subject to multiple inspections is needed to support clinical decision making, especially in the growing field of precision oncology. The number of original publications in this field has risen dramatically with advances in technology and the evolution of in-depth research. Consequently, the issue of how to gather and mine these articles accurately and efficiently now requires close consideration. In this study, we present OncoPubMiner (https://oncopubminer.chosenmedinfo.com), a free and powerful system that combines text mining, data structure customisation, publication search with online reading, and project-centred, team-based data collection to form a one-stop 'keyword in, knowledge out' oncology publication mining platform. The platform was constructed by integrating all open-access abstracts from PubMed and full-text articles from PubMed Central, and it is updated daily. OncoPubMiner makes obtaining precision oncology knowledge from scientific articles straightforward, will assist researchers in efficiently developing structured knowledge base systems, and brings us closer to achieving precision oncology goals.