Keywords: Causal inference; Copula function; Deep learning algorithms; Time-series forecasting; Water resources management

MeSH: Water Quality; Uncertainty; Bayes Theorem; Reproducibility of Results; Algorithms; Forecasting

Source: DOI:10.1016/j.jenvman.2023.119613

Abstract:
Accurate forecasting of water quality variables in river systems is crucial for administrators to identify potential water quality degradation and take countermeasures promptly. However, purely data-driven forecasting models are often insufficient to handle the highly variable periodicity of water quality in today's more complex environment. This study presents a new holistic framework for time-series forecasting of water quality parameters that combines advanced deep learning algorithms (i.e., Long Short-Term Memory (LSTM) and Informer) with causal inference, time-frequency analysis, and uncertainty quantification. The framework was demonstrated for total nitrogen (TN) forecasting in the largest artificial lake in Asia (the Danjiangkou Reservoir, China), using six-year monitoring data from January 2017 to June 2022. The results showed that pre-processing techniques based on causal inference and wavelet decomposition can significantly improve the performance of deep learning algorithms. Compared with the individual LSTM and Informer models, the wavelet-coupled approaches substantially reduced the forecasting errors of TN concentrations, with maximum reductions of 24.39%, 32.68%, and 41.26% in the mean, standard deviation, and maximum of the errors, respectively. In addition, a post-processing algorithm based on the Copula function and Bayesian theory was designed to quantify the uncertainty of the predictions. With this algorithm, each deterministic prediction of the model corresponds to a range of possible outputs. The 95% forecast confidence interval covered almost all of the observations, which supports the reliability and robustness of the predictions. This study provides scientific references for applying advanced data-driven methods to time-series forecasting tasks and a practical methodological framework for water resources management and similar projects.
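The abstract reports two kinds of evaluation figures: percentage reductions in the mean, standard deviation, and maximum of the forecast errors, and the share of observations covered by a 95% interval. The sketch below illustrates how such figures could be computed. It is not the authors' code: the function names, the use of absolute errors, and the toy values are all assumptions made for illustration, and the fixed-width band merely stands in for the Copula- and Bayesian-based interval described in the abstract.

```python
from statistics import mean, pstdev


def error_summary(observed, predicted):
    """Mean, standard deviation, and maximum of absolute forecast errors."""
    errors = [abs(o - p) for o, p in zip(observed, predicted)]
    return {"mean": mean(errors), "std": pstdev(errors), "max": max(errors)}


def percent_reduction(baseline, improved):
    """Relative reduction (in %) of each error statistic versus the baseline."""
    return {k: 100.0 * (baseline[k] - improved[k]) / baseline[k] for k in baseline}


def interval_coverage(observed, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)


# Toy values only (hypothetical, not data from the study).
observed = [1.20, 1.35, 1.10, 1.50]
baseline_pred = [1.00, 1.55, 1.30, 1.10]   # stands in for a plain LSTM/Informer
coupled_pred = [1.10, 1.45, 1.20, 1.30]    # stands in for a wavelet-coupled model

reduction = percent_reduction(
    error_summary(observed, baseline_pred),
    error_summary(observed, coupled_pred),
)

# Placeholder interval: a fixed-width band around each prediction.
half_width = 0.15
lower = [p - half_width for p in coupled_pred]
upper = [p + half_width for p in coupled_pred]
coverage = interval_coverage(observed, lower, upper)

print(reduction, coverage)
```

With real forecasts, `lower` and `upper` would come from the uncertainty-quantification step rather than a constant band, and the coverage would be compared against the nominal 95% level.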