deep learning models

  • Article type: Journal Article
    Sulfur dioxide (SO2), a toxic heavy gas, is a particular hazard to life and the environment. Predicting the diffusion of SO2 has become a research focus in fields such as environmental and safety studies. However, traditional methods, such as kinetic models, cannot balance accuracy against computation time, so they do not meet the needs of emergency decision-making. Deep learning (DL) models are emerging as a highly regarded solution, providing faster and more accurate predictions of gas concentrations. To this end, this study proposes a hybrid DL model, the parallel-connected convolutional neural network-gated recurrent unit (PC CNN-GRU). The model uses two CNNs connected in parallel to process the gas release and meteorological datasets, automatically extracting high-dimensional data features, while the GRU handles long-term temporal dependencies. On real data from the Project Prairie Grass (PPG) case, the proposed model performs well (RMSE, MAE, and R² of 20.1658, 10.9158, and 0.9288, respectively). To address the limited availability of raw data, this study also introduces the time series generative adversarial network (TimeGAN) to SO2 diffusion studies for the first time and verifies its effectiveness. To enhance the practicality of the research, the contribution of each driver to SO2 diffusion is quantified using the permutation importance (PIMP) and Sobol' methods. Additionally, the maximum downwind safe distance under various conditions is visualized based on the SO2 toxicity endpoint concentration. The results of the analyses can provide a scientific basis for relevant decisions and measures.
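
The abstract above reports RMSE, MAE, and R² and ranks drivers by permutation importance. As a rough illustration of what those quantities measure, the following is a minimal pure-Python sketch. It is not the authors' code: the toy model, the column names (release rate, wind speed, noise), and the choice of RMSE increase as the importance score are all assumptions made for this example, and the abstract does not say which variant of permutation importance the paper uses.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, rows, y_true, n_repeats=20, seed=0):
    """Mean increase in RMSE when a single feature column is shuffled.

    `predict` maps a list of feature rows (lists) to a list of predictions.
    A larger increase means the model relies more on that column.
    """
    rng = random.Random(seed)
    baseline = rmse(y_true, predict(rows))
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            # Rebuild every row with only this column replaced by a shuffled value.
            permuted = [row[:col] + [v] + row[col + 1:] for row, v in zip(rows, shuffled)]
            increases.append(rmse(y_true, predict(permuted)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical stand-in for a trained model; columns are
# [release_rate, wind_speed, noise]. The noise column is ignored on purpose.
def toy_model(rows):
    return [10.0 * r[0] - 2.0 * r[1] for r in rows]


data_rng = random.Random(42)
X = [[data_rng.uniform(0, 10), data_rng.uniform(1, 8), data_rng.uniform(0, 1)]
     for _ in range(200)]
y = toy_model(X)
scores = permutation_importance(toy_model, X, y)
```

In this toy setup the first column receives the largest score, the second a smaller one, and the ignored noise column scores exactly zero, which is the kind of ranking a driver-contribution analysis aims to produce.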

  • Article type: Journal Article
    Deep learning models are data analysis tools suited to approximating (non-linear) relationships among variables in order to best predict an outcome. While these models can be used to answer many important questions, their utility is still harshly criticized, because it is extremely challenging to identify which data descriptors best represent a given phenomenon of interest. From recent experience developing a deep learning model to detect failures in mechanical water meter devices, we have learnt that prediction accuracy can deteriorate noticeably if one tries to train the model by adding specific device descriptors based on categorical data. This can happen because of an excessive increase in the dimensionality of the data, with a corresponding loss of statistical significance. After several unsuccessful experiments with alternative methodologies that either reduce the dimensionality of the data space or employ more traditional machine learning algorithms, we changed the training strategy and reconsidered the categorical data in light of a Pareto analysis. In essence, we used those categorical descriptors not as an input on which to train our deep learning model, but as a tool to give a new shape to the dataset, based on the Pareto rule. With this data adjustment, we trained a better-performing deep learning model able to detect defective water meter devices with a prediction accuracy in the range 87-90%, even in the presence of categorical descriptors.
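
The abstract above does not spell out how the Pareto rule reshapes the dataset, so the following is only one plausible reading, sketched in pure Python: rank the values of a categorical descriptor by frequency, keep the "vital few" that together cover about 80% of the records, and train on those records without feeding the descriptor to the model. The field names (`brand`, `flow`, `defective`), the 80% threshold, and the helper `pareto_vital_few` are hypothetical and not taken from the paper.

```python
from collections import Counter


def pareto_vital_few(categories, threshold=0.8):
    """Return the most frequent categories whose cumulative share
    of all records first reaches `threshold`."""
    counts = Counter(categories)
    total = sum(counts.values())
    kept, cumulative = [], 0
    for category, count in counts.most_common():
        kept.append(category)
        cumulative += count
        if cumulative / total >= threshold:
            break
    return kept


# Hypothetical water-meter records with one categorical descriptor.
brands = ["A"] * 50 + ["B"] * 35 + ["C"] * 10 + ["D"] * 5
records = [
    {"brand": b, "flow": i % 7, "defective": i % 3 == 0}
    for i, b in enumerate(brands)
]

# Use the descriptor to select rows, then drop it from the model inputs.
vital = set(pareto_vital_few(r["brand"] for r in records))
reshaped = [
    {key: value for key, value in r.items() if key != "brand"}
    for r in records
    if r["brand"] in vital
]
```

Here brands A and B cover 85% of the records, so the reshaped dataset keeps those 85 rows and contains no categorical column. The input dimensionality stays low, which is the concern the abstract raises about adding categorical descriptors directly.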