Keywords: Content inversion; Heavy metals in soil; Hyperspectral characteristic band; Machine learning; Stacking model

MeSH: Soil; Environmental Monitoring / methods; Soil Pollutants / analysis; Metals, Heavy / analysis; China; Machine Learning

Source: DOI:10.1016/j.jenvman.2024.120503

Abstract:
Concern over the adverse effects of heavy metal pollution in soil has grown significantly worldwide, and accurate prediction of soil heavy metal content is crucial for environmental protection. This study proposes an inversion analysis method for six heavy metals (As, Cd, Cr, Cu, Ni, Pb) in soil, based on hyperspectral data and machine learning algorithms and applied to 21 soil reference materials from multiple provinces in China. On this basis, an ensemble learning model called Stacked RF (base models: XGBoost, LightGBM, and CatBoost; meta-model: Random Forest, RF) was established to perform soil heavy metal inversion. Specifically, three popular algorithms were first used to preprocess the spectral data; RF was then used to select the best feature bands and reduce the impact of noise; finally, the stacking model and four basic machine learning algorithms were used to build inversion models for comparison and analysis. Compared with traditional machine learning methods, the stacking model showed greater stability and higher accuracy. The results indicate that machine learning algorithms, especially ensemble learning models, perform well in inverting heavy metal content in soil. Overall, the MF-RF-Stacking model performed best across the six heavy metals. These results offer a new perspective on using ensemble learning models to invert soil heavy metal content from hyperspectral characteristic bands collected from soil reference materials.