Keywords: Acute respiratory infection; Demographic health survey; Feature selection; Machine learning

MeSH: Humans; Respiratory Tract Infections / epidemiology; Child, Preschool; Machine Learning; Africa South of the Sahara / epidemiology; Infant; Female; Male; Particulate Matter / analysis; Acute Disease; Air Pollution / adverse effects; Infant, Newborn

Source: DOI:10.1038/s41598-024-65620-1

Abstract:
Symptoms of acute respiratory infections (ARIs) among children under five are a global health challenge. We aimed to train and evaluate ten machine learning (ML) classification approaches for predicting mother-reported symptoms of ARIs among children younger than 5 years in sub-Saharan African (sSA) countries. We used the most recent (2012-2022) nationally representative Demographic and Health Surveys data from 33 sSA countries. Air pollution covariates, such as global annual surface particulate matter (PM2.5) and nitrogen dioxide, were available as raster images and were obtained from the National Aeronautics and Space Administration (NASA). ML algorithms (MLAs) were used to predict the symptoms of ARIs among under-five children. We randomly split the dataset into two parts: 80% was used to train the models, and the remaining 20% was used to test the trained models. Model performance was evaluated using sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve (AUC). A total of 327,507 under-five children were included in the study. In the 2 weeks preceding the survey, about 7.10%, 4.19%, 20.61%, and 21.02% of children were reported to have symptoms of ARI, severe ARI, cough, and fever, respectively. The prevalence of ARI was highest in Mozambique (15.3%), Uganda (15.05%), Togo (14.27%), and Namibia (13.65%), whereas Uganda (40.10%), Burundi (38.18%), Zimbabwe (36.95%), and Namibia (31.2%) had the highest prevalence of cough. The random forest (RF) feature importance plot showed that spatial location (longitude, latitude), particulate matter, land surface temperature, nitrogen dioxide, and the number of cattle in the household were the most important features for predicting symptoms of ARIs among under-five children in sSA. The RF algorithm was selected as the best ML model (AUC = 0.77, accuracy = 0.72) for predicting the symptoms of ARIs among children under five.
The MLAs performed well in predicting the symptoms of ARIs and identifying associated predictors among under-five children across the sSA countries. Random forest was identified as the best classifier for predicting the symptoms of ARI among under-five children.
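
The evaluation protocol described in the abstract (a random 80/20 split, then sensitivity, specificity, accuracy, and AUC on the held-out 20%) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' pipeline: the function names `train_test_split`, `confusion_metrics`, and `roc_auc` are hypothetical, the labels are assumed to be coded 1 (symptoms reported) and 0 (no symptoms), and the class predictions are assumed to come from thresholding a model score, a detail the abstract does not specify.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and return (train, test); 80/20 by default."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "accuracy": _ratio(tp + tn, len(pairs)),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))
```

The pairwise AUC above is quadratic in the number of cases and is meant only to make the definition concrete; for a dataset of the size reported here, a rank-based implementation from an established ML library would be the practical choice.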