Keywords: BERT; coreference resolution; cross-entropy loss; high-dimensional features; multi-scale convolution; natural language processing

Source: DOI: 10.3390/e26060529 | PDF (PubMed)

Abstract:
Coreference resolution is a key task in natural language processing. Evaluating the similarity of long text spans is difficult, which makes span-level encoding challenging. This paper first compares how commonly used methods for improving a model's ability to gather global information affect BERT's encoding performance. Building on this comparison, a multi-scale context information module is designed to improve the applicability of the BERT encoder across different text spans. In addition, linear separability is improved through dimension expansion. Finally, cross-entropy is used as the loss function. After adding the proposed module to BERT and SpanBERT, F1 increased by 0.5% and 0.2%, respectively.
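To make the two ideas named in the abstract concrete, below is a minimal PyTorch sketch of a multi-scale context module over BERT token embeddings followed by dimension expansion. The hidden size, kernel widths, and expanded dimension are illustrative assumptions, not the paper's reported configuration, and the class name MultiScaleContext is hypothetical.

```python
# Minimal sketch (assumed configuration, not the paper's exact architecture):
# parallel 1-D convolutions with different kernel sizes capture context at
# multiple scales, and a linear layer expands the feature dimension to improve
# linear separability.
import torch
import torch.nn as nn


class MultiScaleContext(nn.Module):
    def __init__(self, hidden_size: int = 768, kernel_sizes=(3, 5, 7), expand_to: int = 1024):
        super().__init__()
        # One Conv1d branch per kernel size; padding keeps the sequence length.
        self.branches = nn.ModuleList(
            nn.Conv1d(hidden_size, hidden_size, k, padding=k // 2) for k in kernel_sizes
        )
        # Dimension expansion: project the concatenated branches to a larger space.
        self.expand = nn.Linear(hidden_size * len(kernel_sizes), expand_to)

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, hidden_size), e.g. BERT's last hidden state.
        x = token_embeddings.transpose(1, 2)            # (batch, hidden, seq_len) for Conv1d
        feats = [torch.relu(branch(x)) for branch in self.branches]
        x = torch.cat(feats, dim=1).transpose(1, 2)     # (batch, seq_len, hidden * n_branches)
        return self.expand(x)                           # (batch, seq_len, expand_to)


if __name__ == "__main__":
    module = MultiScaleContext()
    dummy = torch.randn(2, 50, 768)   # stand-in for BERT output
    print(module(dummy).shape)        # torch.Size([2, 50, 1024])
```

In a coreference pipeline, the expanded token features would feed span representations scored with a cross-entropy objective, as the abstract describes; that downstream wiring is omitted here.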