Keywords: clinical reasoning; large language model; pre-training; prescription recommendation; traditional Chinese medicine

MeSH: Medicine, Chinese Traditional; Humans; Clinical Reasoning; Drugs, Chinese Herbal / therapeutic use; Electronic Health Records; Natural Language Processing; Datasets as Topic

Source: DOI:10.1093/jamia/ocae087

Abstract:
OBJECTIVE: The recent surge in large language models (LLMs) across various fields has yet to be fully realized in traditional Chinese medicine (TCM). This study aims to bridge this gap by developing a large language model tailored to TCM knowledge, enhancing its performance and accuracy in clinical reasoning tasks such as diagnosis, treatment, and prescription recommendations.
METHODS: This study harnessed a wide array of TCM data resources, including TCM ancient books, textbooks, and clinical data, to create 3 key datasets: the TCM Pre-trained Dataset, the Traditional Chinese Patent Medicine (TCPM) Question Answering Dataset, and the Spleen and Stomach Herbal Prescription Recommendation Dataset. These datasets underpinned the development of the Lingdan Pre-trained LLM and 2 specialized models: the Lingdan-TCPM-Chat Model, which uses a Chain-of-Thought process for symptom analysis and TCPM recommendation, and a Lingdan Prescription Recommendation model (Lingdan-PR) that proposes herbal prescriptions based on electronic medical records.
RESULTS: The Lingdan-TCPM-Chat and Lingdan-PR models, both fine-tuned from the Lingdan Pre-trained LLM, demonstrated state-of-the-art performance on TCM clinical knowledge answering and herbal prescription recommendation. Notably, Lingdan-PR outperformed all state-of-the-art baseline models, achieving an improvement of 18.39% in the Top@20 F1-score compared with the best baseline.
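As a rough illustration of the Top@20 F1-score mentioned above, the sketch below scores one ranked herb list against the herbs in a reference prescription. It is a minimal sketch under stated assumptions, not the authors' evaluation code: the function name `top_k_prf`, the herb names, and the per-record definition of precision and recall are all assumptions, and the abstract does not say how scores are aggregated across records.

```python
from typing import Sequence, Set, Tuple


def top_k_prf(
    ranked_herbs: Sequence[str],
    gold_herbs: Set[str],
    k: int = 20,
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of the first k ranked herbs.

    Hypothetical helper: assumes ranked_herbs has no duplicates and is
    ordered from most to least likely, and that gold_herbs is the set
    of herbs in the reference prescription.
    """
    top_k = list(ranked_herbs[:k])
    hits = sum(1 for herb in top_k if herb in gold_herbs)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(gold_herbs) if gold_herbs else 0.0
    # F1 is the harmonic mean; define it as 0 when nothing matches.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Illustrative (made-up) record: 4 ranked predictions, 3 reference herbs.
ranked = ["baizhu", "fuling", "gancao", "chenpi"]
gold = {"baizhu", "gancao", "banxia"}
precision, recall, f1 = top_k_prf(ranked, gold, k=4)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

A dataset-level score under this assumption would be the mean of the per-record F1 values, although other aggregation schemes are possible.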
CONCLUSIONS: This study marks a pivotal step in merging advanced LLMs with TCM, showcasing the potential of artificial intelligence to help improve clinical decision-making in diagnosis and treatment. The success of the Lingdan Pre-trained LLM and its derivative models, Lingdan-TCPM-Chat and Lingdan-PR, not only revolutionizes TCM practices but also opens new avenues for the application of artificial intelligence in other specialized medical fields. Our project is available at https://github.com/TCMAI-BJTU/LingdanLLM.