Pre-trained language models in medicine: A survey

Abstract:

With the rapid progress in Natural Language Processing (NLP), Pre-trained Language Models (PLMs) such as BERT, BioBERT, and ChatGPT have shown great potential in various medical NLP tasks. This paper surveys cutting-edge achievements in applying PLMs to these tasks. Specifically, we first briefly introduce PLMs and outline the research on PLMs in medicine. Next, we categorise and discuss the types of tasks in medical NLP, covering text summarisation, question answering, machine translation, sentiment analysis, named entity recognition, information extraction, medical education, relation extraction, and text mining. For each type of task, we first provide an overview of the basic concepts, the main methodologies, the advantages of applying PLMs, the basic steps of applying PLMs, the datasets for training and testing, and the metrics for task evaluation. Subsequently, we summarise recent important research findings, analysing their motivations, strengths and weaknesses, and similarities and differences, and discussing potential limitations. We also assess the quality and influence of the research reviewed in this paper by comparing the citation counts of the reviewed papers and the reputation and impact of the conferences and journals where they were published. Through these indicators, we further identify the research topics currently receiving the most attention. Finally, we look ahead to future research directions, including enhancing models' reliability, explainability, and fairness, to promote the application of PLMs in clinical practice. In addition, this survey collects download links for model code and relevant datasets, which are valuable references for researchers applying NLP techniques in medicine and for medical professionals seeking to enhance their expertise and healthcare services through AI technology.
