MeSH: Humans; Information Storage and Retrieval / methods; Natural Language Processing

Source: DOI:10.1093/bioinformatics/btae238 · PDF (PubMed)

Abstract:
Recent proprietary large language models (LLMs), such as GPT-4, have achieved a milestone in tackling diverse challenges in the biomedical domain, ranging from multiple-choice questions to long-form generation. To address challenges that still cannot be handled with the encoded knowledge of LLMs, various retrieval-augmented generation (RAG) methods have been developed, which search documents from a knowledge corpus and append them, unconditionally or selectively, to the input of LLMs for generation. However, when existing methods are applied to different domain-specific problems, poor generalization becomes apparent, leading them to fetch incorrect documents or make inaccurate judgments. In this paper, we introduce Self-BioRAG, a reliable framework for biomedical text that specializes in generating explanations, retrieving domain-specific documents, and self-reflecting on its generated responses. We utilize 84k filtered biomedical instruction sets to train Self-BioRAG, which can assess its generated explanations with customized reflective tokens. Our work shows that domain-specific components, such as a retriever, a domain-related document corpus, and instruction sets, are necessary for adhering to domain-related instructions. On three major medical question-answering benchmark datasets, experimental results demonstrate significant performance gains: Self-BioRAG achieves a 7.2% absolute improvement on average over the state-of-the-art open-foundation model with a parameter size of 7B or less. Similarly, Self-BioRAG outperforms RAG by an 8% Rouge-1 score on average in generating more proficient answers on two long-form question-answering benchmarks. Overall, our analysis shows that Self-BioRAG finds the clues in the question, retrieves relevant documents if needed, and understands how to answer using information from retrieved documents and encoded knowledge, as a medical expert does.
We release our data and code for training our framework components and model weights (7B and 13B) to enhance capabilities in biomedical and clinical domains.
AVAILABILITY: Self-BioRAG is available at https://github.com/dmis-lab/self-biorag.
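The abstract's pipeline — decide whether retrieval is needed, fetch documents, generate, then self-assess the output with reflective tokens — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the retriever, generator, and critique step are toy stand-ins, and all function names here are hypothetical.

```python
def retrieve(query, corpus, top_k=2):
    """Toy retriever: rank corpus documents by token overlap with the query."""
    q_tokens = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q_tokens & set(d.lower().split())))
    return scored[:top_k]

def needs_retrieval(query):
    """Selective retrieval decision (stand-in for a learned
    retrieval reflective token predicted by the model)."""
    return "?" in query  # toy heuristic for illustration only

def generate(query, documents):
    """Stand-in generator: echoes the evidence it conditioned on."""
    if documents:
        return f"Answer to '{query}' grounded in: " + " | ".join(documents)
    return f"Answer to '{query}' from encoded knowledge only."

def reflect(answer, documents):
    """Stand-in critique token: mark the answer supported only if every
    retrieved document is actually reflected in it."""
    return all(doc in answer for doc in documents)

def self_rag_answer(query, corpus):
    """Selective RAG loop: retrieve only if needed, generate, self-assess."""
    docs = retrieve(query, corpus) if needs_retrieval(query) else []
    answer = generate(query, docs)
    supported = reflect(answer, docs)
    return answer, supported
```

In the paper's framework these decisions are made by a trained model via reflective tokens rather than the hard-coded heuristics above; the sketch only shows where each decision sits in the control flow.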