information extraction

  • Article type: Journal Article
    BACKGROUND: The rapid technical progress in the domain of clinical Natural Language Processing and information extraction (IE) has resulted in challenges concerning the comparability and replicability of studies.
    OBJECTIVE: This paper proposes a reporting guideline to standardize the description of methodologies and outcomes for studies involving IE from clinical texts.
    METHODS: The guideline was developed from the experience gained during data extraction for a previously conducted scoping review of 34 studies on IE from free-text radiology reports.
    RESULTS: The guideline comprises five top-level categories: information model, architecture, data, annotation, and outcomes. In total, we define 28 aspects related to these categories that IE studies should report.
    CONCLUSIONS: The proposed guideline is expected to set a standard for reporting in studies describing IE from clinical text and to promote uniformity across the research field. Expected future technological advancements may make regular updates of the guideline necessary. In future research, we plan to develop a taxonomy that clearly defines corresponding value sets and to integrate this guideline and the taxonomy by following a consensus-based methodology.

  • Article type: Journal Article
    OBJECTIVE: The integration of these preventive guidelines with Electronic Health Record (EHR) systems, coupled with the generation of personalized preventive care recommendations, holds significant potential for improving healthcare outcomes. Our study investigates the feasibility of using Large Language Models (LLMs) to automate the extraction of assessment criteria and risk factors from the guidelines for future analysis against medical records in EHRs.
    METHODS: We annotated the criteria, risk factors, and preventive medical services described in the adult guidelines published by the United States Preventive Services Taskforce and evaluated 3 state-of-the-art LLMs on automatically extracting information in these categories from the guidelines.
    RESULTS: We included 24 guidelines in this study. The LLMs can automate the extraction of all criteria, risk factors, and medical services from 9 guidelines. All 3 LLMs perform well on extracting information regarding the demographic criteria or risk factors. Some LLMs perform better on extracting the social determinants of health, family history, and preventive counseling services than the others.
    DISCUSSION: While LLMs demonstrate the capability to handle lengthy preventive care guidelines, several challenges persist, including constraints on the maximum length of input tokens and the tendency to generate content rather than adhere strictly to the original input. Moreover, the use of LLMs in real-world clinical settings requires careful ethical consideration. Healthcare professionals must meticulously validate the extracted information to mitigate biases, ensure completeness, and maintain accuracy.
    CONCLUSIONS: We developed a data structure to store the annotated preventive guidelines and made it publicly available. Employing state-of-the-art LLMs to extract preventive care criteria, risk factors, and preventive care services paves the way for the future integration of these guidelines into the EHR.

  • Article type: Journal Article
    Colorectal cancer is the most commonly occurring cancer in Germany, and the second and third most commonly diagnosed cancer in women and men, respectively. The therapy for this disease is based primarily on the tumor stages, which are usually documented in an unstructured form in medical information systems. In order to re-use this knowledge, the information must be extracted and annotated using the correct terminology.
    In this study, a natural language processing pipeline is developed to identify specific guideline-based patient information and to annotate it with Unified Medical Language System concepts for manual evaluation by a physician. The gold standard for one-time evaluation is determined using the human abstraction of 2513 German clinical notes from electronic health records.
    Using this approach to process the narrative clinical notes on colorectal cancer for retrospective evaluation of the therapy recommendation, the algorithm achieves a precision of 96.64% for tumor stage detection and 97.95% for diagnosis recognition, with recall values of 94.89% and 99.54%, respectively. Averaged across all concepts relevant to treatment decisions for patients with known cancer diagnoses (11 concept groups), it achieves a precision of 82.05%, a recall of 82.45%, and an F1-score of 81.81%.
    The identification of guideline-based information from narrative clinical notes has the potential for implementation as clinical decision support tools.
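    The precision, recall, and F1 figures above can be illustrated with a minimal sketch. It assumes the averages are macro-averages (each metric computed per concept group, then averaged with equal weight); the abstract does not state the averaging scheme, so this is an assumption, and the group names and counts below are hypothetical, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical confusion counts for one concept group."""
    tp: int  # extracted and present in the gold standard
    fp: int  # extracted but absent from the gold standard
    fn: int  # present in the gold standard but missed


def precision(c: Counts) -> float:
    # Share of extracted items that are correct.
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def recall(c: Counts) -> float:
    # Share of gold-standard items that were found.
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def f1(c: Counts) -> float:
    # Harmonic mean of precision and recall for one group.
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def macro_average(groups: dict[str, Counts]) -> tuple[float, float, float]:
    # Average each metric over groups, giving every group equal weight.
    n = len(groups)
    return (
        sum(precision(c) for c in groups.values()) / n,
        sum(recall(c) for c in groups.values()) / n,
        sum(f1(c) for c in groups.values()) / n,
    )


# Illustrative counts only: one precise but incomplete group,
# one complete but imprecise group.
groups = {
    "group_a": Counts(tp=50, fp=0, fn=50),
    "group_b": Counts(tp=50, fp=50, fn=0),
}

macro_p, macro_r, macro_f1 = macro_average(groups)
print(f"precision={macro_p:.4f} recall={macro_r:.4f} f1={macro_f1:.4f}")
```

    With these counts the macro precision and macro recall are both 0.75, while the macro F1 is about 0.667. A macro-averaged F1 can therefore fall below both averaged precision and averaged recall, which is at least consistent with the reported 81.81% F1 sitting below 82.05% and 82.45%.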

  • Article type: Journal Article
    Clinical guidelines and clinical pathways are accepted and proven instruments for quality assurance and process optimization. Today, electronic representations of clinical guidelines exist as unstructured text but are not well integrated with patient-specific information from electronic health records. Consequently, the generic content of the clinical guidelines is accessible, but it is not possible to visualize the position of the patient on the clinical pathway, and personalized guidelines cannot provide decision support for the next treatment step. The Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT) provides a common reference terminology as well as the semantic link for combining the pathways and the patient-specific information. This paper proposes a model-based approach to support the development of guideline-compliant pathways combined with patient-specific structured and unstructured information using SNOMED CT. To identify SNOMED CT concepts, software was developed that extracts SNOMED CT codes from structured and unstructured German data and maps them to clinical pathways annotated in accordance with the systematized nomenclature.