This study addresses the accuracy challenges that limited meteorological data and missing pollutant data pose when assessing the health impacts of air pollution with the Environmental Benefits Mapping and Analysis Program (BenMap). Using a data increment strategy and multiple machine learning models, it examines how data volume, prediction time step, and meteorological factors affect model prediction performance, taking several years of data from Tianjin City as an example. The findings indicate that increasing the training data volume improves the performance of the Random Forest Regressor (RF) and Decision Tree Regressor (DT) models, especially for predicting CO, NO2, and PM2.5. The optimal prediction time step varies by pollutant, with the DT model achieving the highest R² value (0.99) for CO and O3. Combining multiple meteorological factors, such as atmospheric pressure, relative humidity, and dew point temperature, significantly improves model accuracy. With three meteorological factors, the model achieves an R² of 0.99 for predicting CO, NO2, PM10, PM2.5, and SO2. Health impact assessments using BenMap show that the predicted all-cause and disease-specific mortalities were highly consistent with actual values, confirming the model's accuracy in assessing health impacts from air pollution. For instance, the predicted and actual all-cause mortality for PM2.5 were both 3120; for cardiovascular disease, both were 1560; and for respiratory disease, both were 780. To validate its generalizability, the method was applied to Chengdu, China, using several years of data to train and predict PM2.5, CO, NO2, O3, PM10, and SO2, again incorporating atmospheric pressure, relative humidity, and dew point temperature. The model maintained excellent performance, confirming its broad applicability.
Overall, we conclude that the machine learning and BenMap-based methods show high accuracy and reliability in predicting air pollutant concentrations and health impacts, providing a valuable reference for air pollution assessment.
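
To make the roles of the prediction time step, the meteorological factors, and the R² score concrete, the sketch below shows one common way such a forecasting setup can be framed. It is a minimal illustration and not the authors' code: the helper names `make_lagged_samples` and `r2_score`, the choice to attach the meteorological values observed at the predicted time, and all sample numbers are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def make_lagged_samples(
    target: Sequence[float],
    covariates: Sequence[Sequence[float]],
    time_step: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build (feature row, label) pairs from time-aligned series.

    Each feature row holds the previous `time_step` pollutant values followed
    by the meteorological values at the time being predicted (an assumption
    of this sketch; other alignments are possible).
    """
    if time_step < 1:
        raise ValueError("time_step must be at least 1")
    if any(len(series) != len(target) for series in covariates):
        raise ValueError("all series must have the same length")

    features: List[List[float]] = []
    labels: List[float] = []
    for t in range(time_step, len(target)):
        history = list(target[t - time_step:t])       # lagged pollutant values
        weather = [series[t] for series in covariates]  # meteorological factors
        features.append(history + weather)
        labels.append(target[t])
    return features, labels


def r2_score(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the actual values are constant")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up values for illustration only; not data from the study.
    co = [0.8, 0.9, 1.1, 1.0, 1.2]
    pressure = [1012.0, 1011.0, 1009.0, 1010.0, 1008.0]
    humidity = [40.0, 42.0, 50.0, 48.0, 55.0]

    X, y = make_lagged_samples(co, [pressure, humidity], time_step=2)
    print(len(X), X[0], y[0])
```

Rows built this way can be passed to any regressor that accepts a feature matrix and a label vector, for example the `fit(X, y)` method of scikit-learn's `RandomForestRegressor` or `DecisionTreeRegressor`. Varying `time_step` and the list of covariates then corresponds to the time-step and meteorological-factor comparisons described in the abstract.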