Ensemble methods

  • Article type: Journal Article
    This study employs machine learning techniques to identify factors that influence extended Emergency Department (ED) length of stay (LOS) and derives transparent decision rules to complement the results. Leveraging a comprehensive dataset, Gradient Boosting exhibited marginally superior predictive performance compared to Random Forest for LOS classification. Notably, variables like triage acuity and the Elixhauser Comorbidity Index (ECI) emerged as robust predictors. The extracted rules optimize LOS stratification and resource allocation, demonstrating the critical role of data-driven methodologies in improving ED workflow efficiency and patient care delivery.






  • Article type: Journal Article
    Pneumonia is a severe health concern, particularly for vulnerable groups, and requires early and accurate classification for optimal treatment. This study examines deep learning combined with machine learning classifiers (DLxMLCs) for pneumonia classification from chest X-ray (CXR) images. We used modified VGG19, ResNet50V2, and DenseNet121 models for feature extraction, followed by five machine learning classifiers (logistic regression, support vector machine, decision tree, random forest, and artificial neural network). The proposed approach achieved high accuracy: VGG19 and DenseNet121 reached 99.98% when combined with random forest or decision tree classifiers, and ResNet50V2 reached 99.25% with random forest. These results illustrate the advantages of combining deep learning models with machine learning classifiers for fast and accurate identification of pneumonia, and underline the potential of DLxMLC systems to improve diagnostic accuracy and efficiency. Integrating these models into clinical practice could improve patient care and outcomes. Future research should focus on refining these models, exploring their application to other medical imaging tasks, and adding explainability methods to better understand their decision-making and build trust in their clinical use. The technique is promising for medical imaging and patient management.






  • Article type: Journal Article
    Coffee breeding programs have traditionally relied on observing plant characteristics over years, a slow and costly process. Genomic selection (GS) offers a DNA-based alternative for faster selection of superior cultivars. Stacking Ensemble Learning (SEL) combines multiple models for potentially even more accurate selection. This study explores the potential of SEL in coffee breeding, aiming to improve prediction accuracy for important traits [yield (YL), total number of fruits (NF), leaf miner infestation (LM), and cercosporiosis incidence (Cer)] in Coffea arabica. We analyzed data from 195 individuals genotyped for 21,211 single-nucleotide polymorphism (SNP) markers. To assess model performance, we employed a cross-validation (CV) scheme. Genomic Best Linear Unbiased Prediction (GBLUP), multivariate adaptive regression splines (MARS), Quantile Random Forest (QRF), and Random Forest (RF) served as base learners. For the meta-learner within the SEL framework, several options were explored, including Ridge Regression, RF, GBLUP, and Single Average. SEL achieved higher predictive ability (PA) than every base learner for these traits. The gains in PA relative to GBLUP (computed from the ratio between the PA of the best stacking model and that of GBLUP) were 87.44%, 37.83%, 199.82%, and 14.59% for YL, NF, LM, and Cer, respectively. Overall, SEL is a promising approach for GS: by combining predictions from multiple models, it can potentially enhance the PA of GS for complex traits.
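
The stacking idea and the reported PA gain can be illustrated with a minimal sketch. It assumes that PA is the Pearson correlation between predicted and observed values and that the gain is (PA_stacked / PA_base - 1) × 100; the abstract states neither explicitly. The trait values, the two base learners, and the simple-average meta-learner are hypothetical toy data chosen so that the base learners' errors cancel. They are not results from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation, used here as an assumed definition of PA."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical observed trait values and out-of-fold predictions from two
# base learners whose errors are mirror images of each other (toy data).
observed = [2.0, 3.5, 1.0, 4.0, 2.5]
errors = [1.0, -1.0, 1.0, -1.0, 0.5]
base_a = [o + e for o, e in zip(observed, errors)]
base_b = [o - e for o, e in zip(observed, errors)]

# Simple-average meta-learner: the stacked prediction is the mean of the
# base-learner predictions for each individual.
stacked = [(a + b) / 2 for a, b in zip(base_a, base_b)]

pa_base = pearson(base_a, observed)
pa_stacked = pearson(stacked, observed)

# Assumed gain definition: relative improvement of the stacked PA over a base PA.
gain_percent = (pa_stacked / pa_base - 1) * 100
print(f"PA base={pa_base:.3f} stacked={pa_stacked:.3f} gain={gain_percent:.1f}%")
```

In practice the meta-learner would be fitted on out-of-fold predictions inside the cross-validation scheme, so that it never sees predictions made on data the base learners were trained on.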






  • Article type: Journal Article
    In today's digital world, with a growing population and increasing pollution, unhealthy lifestyle habits such as irregular eating, junk food consumption, and lack of exercise are becoming more common, leading to various health problems, including kidney issues. These factors directly affect kidney health. To address this, we require early detection techniques that rely on text data. Text data contains detailed information about a patient's medical history, symptoms, test results, and treatment plans, giving a complete picture of kidney health and enabling timely intervention. In this paper, we propose a range of models, including Gradient Boosting Classifier, LightGBM, CatBoost, Support Vector Classifier (SVC), Random Boost, Logistic Regression, XGBoost, a Deep Neural Network (DNN), and an Improved DNN. The Improved DNN performed strongly, with an accuracy of 90%, precision of 89%, recall of 90%, and an F1-score of 89.5%. By combining traditional machine learning and deep neural networks, this integrative approach enables the identification of intricate patterns in datasets. The model's data-driven processes continually update internal parameters, providing flexibility in response to evolving healthcare settings. This research is a notable advance toward a more detailed and individualised ability to diagnose kidney stones, which could lead to better clinical outcomes and patient treatment.






  • Article type: Journal Article
    This research proposes a novel, three-tier AI-based scheme for the allocation of carbon-neutral mobility hubs. First, it identifies optimal sites using a genetic algorithm, which optimized travel times and achieved a high fitness value of 77,000,000. Second, it performs an Ensemble-based suitability analysis of the pinpointed locations, using factors such as land use mix, population and employment densities, and proximity to parking, biking, and transit. Each factor is weighted by its contribution to carbon emissions and then incorporated into a suitability analysis model, generating scores that guide the final selection of the most suitable mobility hub sites. Third, it employs a traffic assignment model to evaluate these sites' environmental and economic impacts, including reductions in vehicle kilometers traveled and other cost savings. Addressing Sustainable Development Goals 11 and 9, this study leverages advanced techniques to enhance transportation planning policies. The Ensemble model achieved an R-squared of 95% in training and 53% in testing. The identified hub sites reduced daily vehicle travel by 771,074 km, leading to annual savings of 225.5 million USD. By integrating carbon-focused analyses and post-assessment evaluations, this approach offers a comprehensive framework for sustainable mobility hub planning.






  • Article type: Journal Article
    Obstructive sleep apnea/hypopnea syndrome (OSAHS) is a condition linked to severe cardiovascular and neuropsychological consequences, characterized by recurrent episodes of partial or complete upper airway obstruction during sleep, leading to compromised ventilation, hypoxemia, and micro-arousals. Polysomnography (PSG) serves as the gold standard for confirming OSAHS, yet its extended duration, high cost, and limited availability pose significant challenges. In this paper, we employ a range of machine learning techniques, including Neural Networks, Decision Trees, Random Forests, and Extra Trees, for OSAHS diagnosis. This approach aims to achieve a diagnostic process that is not only more accessible but also more efficient. The dataset utilized in this study consists of records from 601 adults assessed between 2014 and 2016 at a specialized sleep medical center in Colombia. This research underscores the efficacy of ensemble methods, specifically Random Forests and Extra Trees, achieving an area under the Receiver Operating Characteristic (ROC) curve of 89.2% and 89.6%, respectively. Additionally, a web application has been devised, integrating the optimal model, empowering qualified medical practitioners to make informed decisions through patient registration, an input of 18 variables, and the utilization of the Random Forests model for OSAHS screening.
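
The area under the ROC curve reported above has a simple pairwise reading: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way on hypothetical labels and scores. It is not the study's data or code, and `roc_auc` is an illustrative helper name.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical screening output: 1 = OSAHS confirmed, 0 = not confirmed.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

This pairwise form is quadratic in the number of samples, which is fine for a few hundred records. A rank-based implementation is preferable for larger datasets.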






  • Article type: Journal Article
    Forecasting is of great importance in the field of renewable energy because it indicates how much energy can be produced and thus enables efficient management of energy sources. However, determining which prediction system is most suitable is complex, as each energy infrastructure is different. This work studies the influence of several variables on predictions made with ensemble methods at different locations. In particular, it analyzes three aspects: the sampling frequency of the solar panel systems, the type of neural network architecture, and the number of ensemble method blocks in each model. Following comprehensive experimentation across multiple locations, our study identified the most effective solar energy prediction model for the specific conditions of each energy infrastructure. The results offer a framework for selecting the optimal system for accurate and efficient energy forecasting. The key finding is the benefit of short time intervals, which holds regardless of the type of prediction model and of its ensemble method.






  • Article type: Journal Article
    Recent developments in quantum technology have opened up new opportunities for machine learning algorithms to assist the healthcare industry in diagnosing complex health disorders, such as heart disease. In this work, we summarize the effectiveness of QuEML in heart disease prediction. To evaluate QuEML against traditional machine learning algorithms, we used the Kaggle heart disease dataset, which contains 1190 samples, of which 53% are labeled positive and the remaining 47% negative. Performance was evaluated in terms of accuracy, precision, recall, specificity, F1 score, and training time. In the experiments, the proposed quantum approaches predicted around 50.03% of positive samples as positive and an average of 44.65% of negative samples as negative, whereas traditional machine learning approaches predicted around 49.78% of positive samples as positive and 44.31% of negative samples as negative. Furthermore, QuEML took an average of 670 µs to train, whereas traditional machine learning algorithms took an average of 862.5 µs. Hence, QuEML was found to be a promising approach for heart disease prediction, with an accuracy 0.6% higher and a training time 192.5 µs shorter than those of traditional machine learning approaches.
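
The headline figures can be checked from the numbers quoted in the abstract. The sketch assumes that the true-positive and true-negative percentages are shares of all 1190 samples, so that their sum is the accuracy. That reading fits the 53%/47% class split but is not stated outright.

```python
# Figures quoted in the abstract: percentages of all samples (assumed),
# and average training time in microseconds.
quantum = {"true_pos": 50.03, "true_neg": 44.65, "train_us": 670.0}
classical = {"true_pos": 49.78, "true_neg": 44.31, "train_us": 862.5}

def accuracy(model):
    """Accuracy in percent as the sum of correctly predicted shares."""
    return model["true_pos"] + model["true_neg"]

accuracy_gain = accuracy(quantum) - accuracy(classical)    # percentage points
time_saved_us = classical["train_us"] - quantum["train_us"]

print(f"quantum accuracy:   {accuracy(quantum):.2f}%")
print(f"classical accuracy: {accuracy(classical):.2f}%")
print(f"accuracy gain:      {accuracy_gain:.2f} points")
print(f"training time saved: {time_saved_us} µs")
```

Under that assumption the accuracies are 94.68% and 94.09%, a difference of 0.59 percentage points. This rounds to the reported 0.6%, and the training-time difference is exactly 192.5 µs.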






  • Article type: Journal Article
    Kinetic process models are widely applied in science and engineering, including atmospheric, physiological and technical chemistry, reactor design, or process optimization. These models rely on numerous kinetic parameters such as reaction rate, diffusion or partitioning coefficients. Determining these properties by experiments can be challenging, especially for multiphase systems, and researchers often face the task of intuitively selecting experimental conditions to obtain insightful results. We developed a numerical compass (NC) method that integrates computational models, global optimization, ensemble methods, and machine learning to identify experimental conditions with the greatest potential to constrain model parameters. The approach is based on the quantification of model output variance in an ensemble of solutions that agree with experimental data. The utility of the NC method is demonstrated for the parameters of a multi-layer model describing the heterogeneous ozonolysis of oleic acid aerosols. We show how neural network surrogate models of the multiphase chemical reaction system can be used to accelerate the application of the NC for a comprehensive mapping and analysis of experimental conditions. The NC can also be applied for uncertainty quantification of quantitative structure-activity relationship (QSAR) models. We show that the uncertainty calculated for molecules that are used to extend training data correlates with the reduction of QSAR model error. The code is openly available as the Julia package KineticCompass.
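
The core idea, ranking candidate experimental conditions by the variance of model output across an ensemble of acceptable fits, can be sketched with a toy model. Everything here is hypothetical: the first-order decay model, the rate constants, and the candidate times are illustrative only. The sketch is written in Python and does not use or reproduce the KineticCompass API.

```python
import math
from statistics import pvariance

def toy_model(rate_constant, time):
    """Hypothetical first-order decay: remaining fraction after `time`."""
    return math.exp(-rate_constant * time)

# Assumed ensemble: rate constants that all fit the existing data equally well.
ensemble = [0.5, 1.0, 2.0]

# Candidate experimental conditions (here: observation times).
conditions = [0.1, 1.0, 10.0]

# Variance of the predicted output across the ensemble, per condition.
variance_by_condition = {
    t: pvariance([toy_model(k, t) for k in ensemble]) for t in conditions
}

# The condition where the ensemble members disagree most is the one whose
# measurement would best discriminate between them.
best_condition = max(variance_by_condition, key=variance_by_condition.get)
print(variance_by_condition, "->", best_condition)
```

In this toy case the intermediate time wins. At very short times no member has decayed much, and at very long times all have decayed almost completely, so neither extreme separates the candidate parameter sets.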






  • Article type: Journal Article
    The rising global incidence of human Mpox cases necessitates prompt and accurate identification for effective disease control. Whereas previous studies have predominantly relied on traditional ensemble methods for detection, we introduce a metaheuristic-based ensemble framework. We present the CGO-Ensemble framework, designed to improve the accuracy of detecting Mpox infection in patients. First, we employ five transfer learning base models that integrate feature integration layers and residual blocks. These components capture significant features from the skin images, thereby enhancing the models' efficacy. Next, we employ a weighted averaging scheme to consolidate the predictions generated by the distinct models. To find the optimal allocation of weights for each base model in the ensemble, we use the Chaos Game Optimization (CGO) algorithm. This weight assignment improves classification outcomes considerably, surpassing the performance of randomly assigned weights, and yields notably higher prediction accuracy than individual models. We evaluate the proposed approach through comprehensive experiments on two widely recognized benchmark datasets: the Mpox Skin Lesion Dataset (MSLD) and the Mpox Skin Image Dataset (MSID). To gain insight into the decision-making process of the base models, we performed Gradient-weighted Class Activation Mapping (Grad-CAM) analysis. The CGO-Ensemble achieves an accuracy of 100% on MSLD and 94.16% on MSID. Our approach significantly outperforms other state-of-the-art optimization algorithms, traditional ensemble methods, and existing techniques for Mpox detection on these datasets. These findings underscore the effectiveness of the CGO-Ensemble in accurately identifying Mpox cases, highlighting its potential in disease detection and classification.
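
The weighted averaging step can be sketched as follows. The class probabilities and weights are hypothetical, and three models stand in for the five base models for brevity. In the study the weights are found by the CGO algorithm, which is not implemented here; the sketch only shows how a given weight vector combines the base-model predictions.

```python
def weighted_average(probabilities, weights):
    """Combine per-model class probabilities using normalized weights.

    probabilities: one list of class probabilities per base model.
    weights: one non-negative weight per base model.
    """
    if len(probabilities) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probabilities[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]

# Hypothetical outputs of three base models for one image:
# [P(not Mpox), P(Mpox)].
model_probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]

# Hypothetical weights; an optimizer such as CGO would search for these.
weights = [0.2, 0.3, 0.5]

combined = weighted_average(model_probs, weights)
predicted_class = max(range(len(combined)), key=combined.__getitem__)
print(combined, predicted_class)
```

An optimizer would evaluate many candidate weight vectors this way on a validation set and keep the one with the best classification metric.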





