deep learning models

  • Article type: Journal Article
    Disasters caused by mine water inflow seriously threaten the safety of coal mining operations. Deep mining complicates the acquisition of hydrogeological parameters, the understanding of water inrush mechanisms, and the prediction of sudden changes in mine water inflow. Traditional models and single machine learning approaches often fail to forecast abrupt shifts in mine water inflow accurately. This study introduces a novel coupled decomposition-optimization-deep learning model that integrates Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN), Northern Goshawk Optimization (NGO), and Long Short-Term Memory (LSTM) networks. We evaluate three types of mine water inflow forecasting methods: a single time series prediction model, a decomposition-prediction coupled model, and a decomposition-optimization-prediction coupled model, assessing both their prediction accuracy and their ability to capture sudden changes in data trends. Results show that the single prediction model performs best with a sliding input step of 3 and a maximum of 400 epochs. Compared to the CEEMDAN-LSTM model, the CEEMDAN-NGO-LSTM model is better at predicting local extreme shifts in mine water inflow volumes. Specifically, the CEEMDAN-NGO-LSTM model achieves an MAE of 96.578, a MAPE of 1.471%, an RMSE of 122.143, and an NSE of 0.958, representing average performance improvements of 44.950% and 19.400% over the LSTM model and the CEEMDAN-LSTM model, respectively. This model also provides the most accurate predictions of mine water inflow volumes over the next five days. The decomposition-optimization-prediction coupled model therefore offers a novel technical solution for the safety monitoring of smart mines, with significant theoretical and practical value for ensuring safe mining operations.
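
    The sliding input step and the four error metrics named in this abstract can be sketched as follows. This is a minimal illustration, not the study's code: the inflow values are hypothetical, the function names are invented here, and a naive "repeat the last value" forecast stands in for the LSTM.

```python
import math

def sliding_windows(series, step=3):
    """Turn a 1-D series into (input window, next value) pairs."""
    pairs = []
    for i in range(len(series) - step):
        pairs.append((series[i:i + step], series[i + step]))
    return pairs

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def mape(obs, pred):
    """Mean absolute percentage error; assumes no observed value is zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance about the mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

# Hypothetical inflow values, for illustration only (not data from the study).
inflow = [100.0, 110.0, 120.0, 130.0, 140.0]
pairs = sliding_windows(inflow, step=3)
observed = [target for _, target in pairs]
# Naive persistence forecast: predict the last value in each window.
predicted = [window[-1] for window, _ in pairs]

print(mae(observed, predicted), rmse(observed, predicted), nse(observed, predicted))
```

    An NSE of 1 means a perfect fit, and a value below 0 means the forecast is worse than always predicting the observed mean, which is what the persistence forecast produces on this steadily rising toy series.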

  • Article type: Journal Article
    Deep learning models provide a more powerful method for accurate and stable prediction of river water quality, which is crucial for the intelligent management and control of the water environment. To improve the accuracy of water quality parameter prediction and to learn more about the impact of complex spatial information on deep learning models, this study proposes two ensemble models, TNX (with temporal attention) and STNX (with spatio-temporal attention), based on the seasonal and trend decomposition (STL) method, to predict water quality from geo-sensory time series data. Dissolved oxygen, total phosphorus, and ammonia nitrogen were predicted at short steps (1 h and 2 h) and long steps (12 h and 24 h) using seven water quality monitoring sites in a river. The ensemble model TNX improved performance by 2.1%-6.1% for short-step prediction and 4.3%-22.0% for long-step prediction relative to the best baseline deep learning model, and it captured the variation pattern of the water quality parameters by predicting only the trend component of the raw data after STL decomposition. The STNX model, with spatio-temporal attention, achieved 0.5%-2.4% and 2.3%-5.7% higher performance than the TNX model for short-step and long-step prediction, respectively, and this improvement was more effective in mitigating the prediction shift patterns of long-step prediction. Moreover, the model interpretation results consistently showed positive relationship patterns across all monitoring sites. However, the significance of the seven specific monitoring sites diminished as the distance between the predicted and input monitoring sites increased. This study provides an ensemble modeling approach based on STL decomposition for improving short-step and long-step prediction of river water quality parameters and for understanding the impact of complex spatial information on deep learning models.
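
    To make "predicting only the trend component" concrete, the sketch below splits a series into trend, seasonal, and residual parts. It is deliberately not STL: STL uses LOESS smoothing, whereas this is the simpler classical additive decomposition with a centered moving average, swapped in as a stand-in. The function names and the toy series are assumptions made for illustration.

```python
def moving_average_trend(series, period):
    """Centered moving average of odd length `period`; edges are left as None."""
    half = period // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        window = series[i - half:i + half + 1]
        trend[i] = sum(window) / period
    return trend

def decompose(series, period):
    """Additive split into trend, seasonal, and residual parts.

    Assumes `period` is odd and that every position in the cycle has at
    least one point where the trend is defined.
    """
    trend = moving_average_trend(series, period)
    # Detrend where the trend exists, then average by position in the cycle.
    buckets = [[] for _ in range(period)]
    for i, (x, t) in enumerate(zip(series, trend)):
        if t is not None:
            buckets[i % period].append(x - t)
    means = [sum(b) / len(b) for b in buckets]
    center = sum(means) / period  # force the seasonal part to average to zero
    seasonal = [means[i % period] - center for i in range(len(series))]
    residual = [
        None if t is None else x - t - s
        for x, t, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, residual

# Hypothetical series: a linear trend plus a repeating pattern of length 3.
pattern = [1.0, 0.0, -1.0]
series = [float(i) + pattern[i % 3] for i in range(9)]
trend, seasonal, residual = decompose(series, period=3)
print(trend)
print(seasonal)
```

    In a pipeline like the one described, a forecasting model would then be trained on the trend series alone; that step is omitted here.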

  • Article type: Journal Article
    Tea production under exposure to multiple stresses faces challenges that have harmed its global market sustainability, so a fast in-field technique for monitoring stress in tea leaves is urgently needed. This study therefore aimed to propose an efficient method for detecting stress symptoms using a portable smartphone with deep learning models. First, a database of over 10,000 images of tea garden canopies in complex natural scenes was developed, covering healthy (no stress) leaves and three types of stress: tea anthracnose (TA), tea blister blight (TB), and sunburn (SB). Then, the YOLOv5m and YOLOv8m algorithms were adapted to discriminate among the four classes; the YOLOv8m algorithm achieved better performance in identifying healthy leaves (98%), TA (92.0%), TB (68.4%), and SB (75.5%). Furthermore, the YOLOv8m algorithm was used to build a model for differentiating TA disease severity, with satisfactory accuracies of 94%, 96%, and 91% for mild, moderate, and severe TA infections, respectively. In addition, we found that the CNN kernels of YOLOv8m efficiently extract the texture characteristics of the images at layer 2, and that these characteristics clearly distinguish the different types of stress symptoms. This contributes greatly to the high-precision differentiation of the four classes by the YOLOv8m model. In conclusion, our study provides an effective system for low-cost, high-precision, fast, in-field diagnosis of tea stress symptoms in complex natural scenes based on a smartphone and deep learning algorithms.

  • Article type: Journal Article
    Ensuring the safety and efficacy of chemical compounds is crucial in small-molecule drug development. In the later stages of drug development, toxic compounds pose a significant challenge, wasting valuable resources and time. Early and accurate prediction of compound toxicity using deep learning models offers a promising way to mitigate these risks during drug discovery. In this study, we present the development of several deep learning models for evaluating different types of compound toxicity, including acute toxicity, carcinogenicity, hERG_cardiotoxicity (cardiotoxicity caused by the human ether-a-go-go related gene), hepatotoxicity, and mutagenicity. To address the inherent variations in data size, label type, and distribution across the different types of toxicity, we employed diverse training strategies. Our first approach used a graph convolutional network (GCN) regression model to predict acute toxicity, which achieved notable performance, with Pearson R values of 0.76, 0.74, and 0.65 for the intraperitoneal, intravenous, and oral administration routes, respectively. Furthermore, we trained multiple GCN binary classification models, each tailored to a specific type of toxicity. These models achieved area under the curve (AUC) scores of 0.69, 0.77, 0.88, and 0.79 for predicting carcinogenicity, hERG_cardiotoxicity, mutagenicity, and hepatotoxicity, respectively. Additionally, we used the approved drug dataset to determine an appropriate threshold for the prediction score when the models are used. We integrated these models into a virtual screening pipeline to assess their effectiveness in identifying potential low-toxicity drug candidates. Our findings indicate that this deep learning approach has the potential to significantly reduce the cost and risk associated with drug development by expediting the selection of compounds with low toxicity profiles. The models developed in this study therefore hold promise as critical tools for early drug candidate screening and selection.
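
    The two scores reported in this abstract, Pearson R for the regression models and AUC for the binary classifiers, can be computed as sketched below. The labels, scores, and function names are hypothetical and serve only to show the definitions; they are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (e.g. toxic) or 0 (non-toxic).
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier output for four compounds.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))

# Hypothetical measured vs. predicted toxicity values.
print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

    The pairwise form of AUC is quadratic in the number of samples; it is used here because it matches the definition directly, not because it is efficient.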

  • Article type: Multicenter Study
    BACKGROUND: Chronic obstructive pulmonary disease (COPD) is underdiagnosed with the current gold-standard measure, the pulmonary function test (PFT). A more sensitive and simpler option for early detection and severity evaluation of COPD could benefit practitioners and patients.
    METHODS: In this multicenter retrospective study, frontal chest X-ray (CXR) images and related clinical information of 1055 participants were collected and processed. Different deep learning algorithms and transfer learning models were trained to classify COPD based on clinical data and CXR images from 666 subjects, and were validated on an internal test set of 284 participants. An external test with 105 participants was also performed to verify the generalization ability of the learning algorithms in diagnosing COPD. The model was further used to evaluate COPD disease severity by predicting different grades.
    RESULTS: The Ensemble model, which simultaneously extracts fusion features of clinical parameters and CXR images, showed an AUC of 0.969 in distinguishing COPD in the internal test, better than models that used only clinical parameters (AUC = 0.963) or only images (AUC = 0.946). For the external test set, the AUC declined slightly to 0.934 in predicting COPD based on clinical parameters and CXR images. When the Ensemble model was applied to determine COPD disease severity, the AUC reached 0.894 for three-class and 0.852 for five-class classification.
    CONCLUSIONS: The present study used DL algorithms to screen for COPD and predict disease severity based on CXR imaging and clinical parameters. The models showed good performance, and the approach might be an effective case-finding tool with a low radiation dose for COPD diagnosis and staging.

  • Article type: Journal Article
    Color face images are often transmitted over public channels, where they are vulnerable to tampering attacks. To address this problem, the present paper introduces a novel scheme called Authentication and Color Face Self-Recovery (AuCFSR) for ensuring the authenticity of color face images and recovering the tampered areas in these images. AuCFSR uses a new two-dimensional hyperchaotic system, the two-dimensional modular sine-cosine map (2D MSCM), to embed authentication and recovery data into the least significant bits of color image pixels. This produces high-quality output images with a high security level. When a tampered color face image is detected, AuCFSR executes two deep learning models: the CodeFormer model to enhance the visual quality of the recovered color face image and the DeOldify model to improve its colorization. Experimental results demonstrate that AuCFSR outperforms recent similar schemes in tamper detection accuracy, security level, and visual quality of the recovered images.
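
    Embedding data in the least significant bits of pixel values, the carrier mechanism this abstract mentions, can be sketched as below. The sketch covers only plain LSB writing and reading on a flat list of 8-bit channel values; it does not model the 2D MSCM hyperchaotic system, the authentication codes, or the recovery data of AuCFSR, none of which are specified in the abstract. The function names and sample values are assumptions.

```python
def embed_bits(pixels, bits):
    """Write one payload bit into the least significant bit of each value."""
    if len(bits) > len(pixels):
        raise ValueError("payload longer than carrier")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the 8-bit value, then set it to the payload bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego

def extract_bits(pixels, count):
    """Read back the first `count` least significant bits."""
    return [p & 1 for p in pixels[:count]]

# Hypothetical 8-bit channel values and payload.
carrier = [200, 131, 54, 77]
payload = [1, 0, 1, 1]
stego = embed_bits(carrier, payload)
print(stego)
print(extract_bits(stego, len(payload)))
```

    Each value changes by at most 1, which is why LSB embedding leaves the image visually almost unchanged.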

  • Article type: Journal Article
    Identifying critical safety management drivers with high driver-level risks is essential for improving traffic safety. Previous studies commonly evaluated driver-level risks based on aggregated statistical characteristics (e.g., driving exposure and driving behavior) obtained from long-period driving monitoring data. However, given the great advances in connected vehicle and in-vehicle data instrumentation technologies, the collection of short-period driving data has increased notably, and such data have become a prominent source for analysis. In this data environment, the traditionally employed aggregated behavior characteristics are unstable because driving behavior varies over time and the data sampling periods are insufficient. Traditional modeling methods based on aggregated statistical characteristics are therefore no longer feasible. Instead of using such unreliable statistical information to represent driver-level risks, this study employed the temporal variation characteristics of driving behavior to identify critical safety management drivers in the short-period driving data environment. Specifically, the relationships between the temporal variation characteristics of driving behavior and individual crash occurrence probability were developed. To eliminate the impact of heterogeneity in drivers' behavior on model performance, a "traffic entropy" index that quantifies the degree of abnormality in driving behavior was proposed. Deep learning models, including a convolutional neural network (CNN) and long short-term memory (LSTM), were employed to mine the temporal variation features. Empirical analyses were conducted using data obtained from online ride-hailing services. Experimental results showed that models based on temporal variation characteristics outperformed traditional models based on aggregated statistical characteristics, improving the area under the curve (AUC) index by 4.1%. The proposed traffic entropy index further enhanced model performance by 5.3%. The best model achieved an AUC of 0.754, comparable to existing approaches that use long-period driving data. Finally, applications of the proposed method in developing driver management programs, and directions for further investigation, are discussed.

  • Article type: Journal Article
    In hyperspectral image classification, the pursuit of higher accuracy and more comprehensive feature extraction has led to the development of advanced architectural paradigms. This study proposes a unified model framework that combines the capabilities of three distinct branches: a swin transformer, a convolutional neural network, and an encoder-decoder. The main objective is to facilitate multiscale feature learning, a pivotal aspect of hyperspectral image classification, with each branch specializing in a different aspect of multiscale feature extraction. The swin transformer, recognized for its ability to distill long-range dependencies, captures structural features across different scales; at the same time, the convolutional neural network performs localized feature extraction, preserving fine-grained spatial information. The encoder-decoder branch performs comprehensive analysis and reconstruction, helping the model assimilate both multiscale spectral and spatial detail. To evaluate our approach, we conducted experiments on publicly available datasets and compared the results with state-of-the-art methods. Our proposed model obtained the best classification results among the compared models. Specifically, overall accuracies of 96.87%, 98.48%, and 98.62% were obtained on the Xuzhou, Salinas, and LK datasets, respectively.

  • Article type: Journal Article
    The toxic heavy gas sulfur dioxide (SO2) is a specific hazard to life and the environment. Predicting the diffusion of SO2 has become a research focus in fields such as environmental and safety studies. However, traditional methods, such as kinetic models, cannot balance precision and computation time, so they do not meet the needs of emergency decision-making. Deep learning (DL) models are emerging as a highly regarded solution, providing faster and more accurate predictions of gas concentrations. To this end, this study proposes an innovative hybrid DL model, the parallel-connected convolutional neural network-gated recurrent unit (PC CNN-GRU). This model uses two CNNs connected in parallel to process gas release and meteorological datasets, enabling automatic extraction of high-dimensional data features, while the GRU handles long-term temporal dependencies. The proposed model demonstrates good performance (RMSE, MAE, and R2 of 20.1658, 10.9158, and 0.9288, respectively) on real data from the Project Prairie Grass (PPG) case. To address the limited availability of raw data, this study introduces the time series generative adversarial network (TimeGAN) into SO2 diffusion studies for the first time and verifies its effectiveness. To enhance the practicality of the research, the contribution of driving factors to SO2 diffusion is quantified using the permutation importance (PIMP) and Sobol' methods. Additionally, the maximum safe downwind distance under various conditions is visualized based on the SO2 toxicity endpoint concentration. The results of the analyses can provide a scientific basis for relevant decisions and measures.
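
    The idea behind permutation importance, one of the two attribution methods named in this abstract, is to shuffle one input feature at a time and measure how much the prediction error grows. The sketch below shows the plain form of that idea only; it does not include any significance testing and does not cover the Sobol' method. The toy model, the data, and the function names are hypothetical stand-ins, not the PC CNN-GRU or the Prairie Grass data.

```python
import random

def mse(model, rows, targets):
    """Mean squared error of `model` over the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_repeats=20, seed=0):
    """Mean increase in MSE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by its shuffled value.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def toy_model(row):
    """Hypothetical stand-in for a trained predictor: uses feature 0 only."""
    return 3.0 * row[0]

# Hypothetical two-feature dataset; feature 1 is irrelevant by construction.
rows = [[float(i), float((i * 7) % 5)] for i in range(10)]
targets = [toy_model(r) for r in rows]
scores = permutation_importance(toy_model, rows, targets)
print(scores)
```

    Shuffling the irrelevant feature leaves the error unchanged, so its importance is zero, while shuffling the feature the model depends on raises the error.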

  • Article type: Journal Article
    "Treatise on Febrile Diseases" is an important classic in the academic history of Chinese materia medica. Based on a knowledge graph of traditional Chinese medicine built from the study of "Treatise on Febrile Diseases", a question-answering system for traditional Chinese medicine was established to help people better understand and use traditional Chinese medicine. Intent classification is the basis of such a question-answering system, but as far as we know, there has been no research on question intent classification based on "Treatise on Febrile Diseases". In this paper, intent classification is studied using the Chinese materia medica-related content of "Treatise on Febrile Diseases" as data. Most existing models perform well on long-text classification tasks, but at high cost and with large memory requirements. The intent classification data in this paper, however, consist of short texts, with a small amount of data and unbalanced categories. In response to these problems, this paper proposes a knowledge distillation-based bidirectional Transformer encoder combined with a convolutional neural network (TinyBERT-CNN) for question intent classification in "Treatise on Febrile Diseases". The model uses TinyBERT as the embedding and encoding layer to obtain the global vector information of the text, and then feeds the encoded feature information into the CNN to complete the intent classification. The experimental results indicate that the model outperforms other models, with accuracy, recall, and F1 values of 96.4%, 95.9%, and 96.2%, respectively. These results show that the proposed model can effectively classify the intent of question sentences in "Treatise on Febrile Diseases" and can provide technical support for a future question-answering system based on it.
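
    Because the abstract stresses unbalanced categories, it helps to see how accuracy, recall, and F1 behave when classes differ in size. The sketch below computes accuracy together with macro-averaged recall and F1; the abstract does not say which averaging the authors used, so macro averaging is an assumption, and the intent labels and predictions are hypothetical.

```python
def per_class_counts(y_true, y_pred, label):
    """True positives, false positives, and false negatives for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    return tp, fp, fn

def macro_scores(y_true, y_pred):
    """Return (accuracy, macro recall, macro F1) over the classes in y_true."""
    labels = sorted(set(y_true))
    recalls, f1s = [], []
    for label in labels:
        tp, fp, fn = per_class_counts(y_true, y_pred, label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return accuracy, sum(recalls) / len(labels), sum(f1s) / len(labels)

# Hypothetical intent labels with unequal class sizes.
y_true = ["dose", "dose", "dose", "dose", "herb", "herb"]
y_pred = ["dose", "dose", "dose", "herb", "herb", "herb"]
accuracy, macro_recall, macro_f1 = macro_scores(y_true, y_pred)
print(accuracy, macro_recall, macro_f1)
```

    Macro averaging weights every class equally, so a small class that is classified poorly lowers the score even when overall accuracy stays high.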