Keywords: Biomedical named entity recognition; Deep learning; Language model; Multi-task learning

MeSH: Deep Learning; Data Mining / methods; Humans; Natural Language Processing; Neural Networks, Computer; Language

Source: DOI: 10.1016/j.ymeth.2024.04.013

Abstract:
Biomedical Named Entity Recognition (BioNER) is one of the most basic tasks in biomedical text mining, aiming to automatically identify and classify biomedical entities in text. Recently, deep learning-based methods have been applied to BioNER and have shown encouraging results. However, many biomedical entities are polysemous and ambiguous, which is one of the main obstacles to the task. Moreover, deep learning methods require large amounts of training data, so the lack of data also affects recognition performance. To address the problems of polysemy and insufficient data in BioNER, we propose a multi-task learning framework fused with a language model, built on the BiLSTM-CRF architecture. Our model uses a language model to encode context differentially, obtaining dynamic word vectors that distinguish words across different datasets. Furthermore, we use multi-task learning to share the dynamic word vectors across different entity types, improving the recognition performance for each type of entity. Experimental results show that our model reduces false positives caused by polysemous words through differentiated encoding, and improves the performance of each subtask by sharing information between datasets of different entity types. Compared with other state-of-the-art methods, our model achieved superior results on four typical datasets and obtained the best F1 scores.
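The CRF layer in the BiLSTM-CRF architecture described above selects the globally best tag sequence at decoding time via the Viterbi algorithm. The abstract does not give implementation details, so the following is a minimal, self-contained sketch of linear-chain CRF Viterbi decoding; the emission scores (which in the real model would come from the BiLSTM over dynamic word vectors) and the transition matrix are hypothetical hand-set values for a 3-tag BIO scheme.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence for one sentence.

    emissions:   list of per-token score lists, shape [seq_len][num_tags]
                 (in BiLSTM-CRF these come from the BiLSTM output layer).
    transitions: transitions[i][j] is the score of moving from tag i to tag j
                 (learned jointly with the network in the real model).
    """
    num_tags = len(emissions[0])
    # score[j] = best score of any tag path ending in tag j at current token
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, bp = [], []
        for j in range(num_tags):
            # Best previous tag to transition into tag j
            best_i = max(range(num_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            bp.append(best_i)
        score = new_score
        backpointers.append(bp)
    # Trace the best path backwards from the best final tag
    best_last = max(range(num_tags), key=lambda j: score[j])
    path = [best_last]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    path.reverse()
    return path


# Hypothetical example: tags O=0, B=1, I=2 for a 3-token span.
# The transition scores penalize the invalid O -> I move and reward B -> I.
emissions = [[0, 2, 0], [0, 0, 2], [2, 0, 0]]
transitions = [[0, 0, -10],   # from O
               [0, 0, 1],     # from B
               [0, 0, 1]]     # from I
print(viterbi_decode(emissions, transitions))  # -> [1, 2, 0], i.e. B I O
```

The key property the CRF adds over per-token softmax classification is exactly this: the transition matrix lets the decoder rule out label sequences that are locally plausible but structurally invalid (such as an I tag with no preceding B).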