Keywords: dengue fever; machine learning; meteorological data; prediction

Source: DOI:10.3390/tropicalmed9040072

Abstract:
OBJECTIVE: This study aimed to improve dengue fever predictions in Singapore using a machine learning model that incorporates meteorological data, addressing the current methodological limitations by examining the intricate relationships between weather changes and dengue transmission.
METHODS: Weekly dengue case and meteorological data from 2012 to 2022 were preprocessed and analyzed using several machine learning algorithms: General Linear Model (GLM), Support Vector Machine (SVM), Gradient Boosting Machine (GBM), Decision Tree (DT), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost). Model performance was assessed with Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and R-squared (R2).
RESULTS: From 2012 to 2022, a total of 164,333 dengue fever cases were recorded. The number of cases in Singapore fluctuated over the period, peaking notably in 2020 and showing strong seasonality between March and July. Analysis of the meteorological data highlighted connections between certain climate variables and dengue fever outbreaks. The correlation analyses suggested significant associations between dengue cases and specific weather factors such as solar radiation, solar energy, and UV index. For disease prediction, the XGBoost model performed best, with MAE = 89.12, RMSE = 156.07, and R2 = 0.83. It identified time as the primary factor, while 19 key predictors showed non-linear associations with dengue transmission. This underscores the significant role of environmental conditions, including cloud cover and rainfall, in dengue propagation.
CONCLUSIONS: In the last decade, meteorological factors have significantly influenced dengue transmission in Singapore. This research, using the XGBoost model, highlights key predictors such as time and cloud cover in understanding dengue's complex dynamics. By employing advanced algorithms, our study offers insights into dengue predictive models and the importance of careful model selection. These results can inform public health strategies aimed at improving dengue control in Singapore and comparable regions.
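The three performance metrics named in the METHODS paragraph (MAE, RMSE, and R2) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the four weekly case counts below are hypothetical, chosen only to show how each metric is computed from observed and predicted values.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Square Error: penalizes large errors more than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def r2(y_true, y_pred):
    """R-squared: share of variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical weekly case counts, invented for illustration only
observed = [100, 150, 200, 250]
predicted = [110, 140, 210, 240]

print(mae(observed, predicted), rmse(observed, predicted), r2(observed, predicted))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data; a wide gap between the two, as in the reported MAE = 89.12 versus RMSE = 156.07, generally indicates that a few weeks carry much larger errors than the rest.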