Keywords: Data mining; Deep learning models; Intent classification; Short text; Typhoid theory

MeSH: Humans; Intention; Neural Networks, Computer; Language

Source: DOI:10.1016/j.compbiomed.2023.107075

Abstract:
"Treatise on Febrile Diseases" is an important classic in the academic history of Chinese materia medica. A traditional Chinese medicine question-answering system has been built on a knowledge graph derived from the study of "Treatise on Febrile Diseases", to help people better understand and use traditional Chinese medicine. Intent classification is the foundation of such a question-answering system, but as far as we know, there is no prior research on question intent classification based on "Treatise on Febrile Diseases". In this paper, intent classification is studied using the Chinese materia medica-related content of "Treatise on Febrile Diseases" as data. Most existing models perform well on long-text classification tasks but are computationally expensive and memory-intensive. The intent classification data in this paper, by contrast, consists of short texts, is small in volume, and has imbalanced categories. To address these problems, this paper proposes a model that combines a knowledge-distilled bidirectional Transformer encoder with a convolutional neural network (TinyBERT-CNN) for the task of question intent classification in "Treatise on Febrile Diseases". The model uses TinyBERT as the embedding and encoding layer to obtain the global vector information of the text, then feeds the encoded features into the CNN to complete the intent classification. The experimental results indicate that the model outperformed the other models compared, with accuracy, recall, and F1 values of 96.4%, 95.9%, and 96.2%, respectively. These results show that the proposed model can effectively classify the intent of question sentences in "Treatise on Febrile Diseases" and can provide technical support for a later question-answering system based on the text.
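The second stage of the pipeline described above, passing encoded token features through a CNN to obtain class probabilities, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not give the filter widths, the number of filters, the hidden size, or the number of intent classes, so every value below is an assumption, and the TinyBERT encoder output is replaced by random vectors as a stand-in. The function names (`conv1d_relu`, `cnn_head`, `random_matrix`) are hypothetical.

```python
import math
import random


def conv1d_relu(tokens, kernel, bias):
    """Slide one filter over the token sequence and apply ReLU.

    tokens: list of seq_len vectors, each of length dim
    kernel: list of k vectors, each of length dim (requires k <= seq_len)
    """
    k = len(kernel)
    activations = []
    for start in range(len(tokens) - k + 1):
        total = bias
        for offset in range(k):
            total += sum(w * x for w, x in zip(kernel[offset], tokens[start + offset]))
        activations.append(max(0.0, total))
    return activations


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def cnn_head(tokens, filters, out_weights, out_bias):
    """Convolve, max-pool over time, then apply a linear layer and softmax."""
    # One pooled feature per filter: the strongest response anywhere in the text.
    pooled = [max(conv1d_relu(tokens, kernel, bias)) for kernel, bias in filters]
    logits = [
        sum(w * p for w, p in zip(row, pooled)) + b
        for row, b in zip(out_weights, out_bias)
    ]
    return softmax(logits)


def random_matrix(rng, rows, cols):
    """Hypothetical helper: small random weights for the demonstration."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Assumed sizes, chosen only for illustration.
SEQ_LEN, DIM, NUM_CLASSES = 8, 16, 4
rng = random.Random(0)

# Stand-in for the encoder output: one vector per token of a short question.
encoded = random_matrix(rng, SEQ_LEN, DIM)

# Two untrained filters for each assumed width (2, 3, and 4 tokens).
filters = [(random_matrix(rng, width, DIM), 0.0) for width in (2, 3, 4) for _ in range(2)]
out_weights = random_matrix(rng, NUM_CLASSES, len(filters))
out_bias = [0.0] * NUM_CLASSES

probs = cnn_head(encoded, filters, out_weights, out_bias)
predicted = probs.index(max(probs))
print(predicted, [round(p, 3) for p in probs])
```

Max-over-time pooling keeps only the strongest response of each filter, so the size of the classifier input does not depend on the length of the question. That property is one reason a convolutional head is a plausible fit for short texts. In a real system the random weights would be learned and `encoded` would come from the trained encoder.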