Ensemble learning

  • Article type: Journal Article
    The continuous increase in carbon dioxide (CO2) emissions from fuel vehicles intensifies the greenhouse effect in the atmosphere, which contributes to global warming and climate change and raises serious concerns about environmental sustainability. Research on estimating and reducing vehicle CO2 emissions is therefore crucial for promoting environmental sustainability and reducing greenhouse gas emissions.
    This study performed a comparative regression analysis using 18 different regression algorithms based on machine learning, ensemble learning, and deep learning paradigms to evaluate and predict CO2 emissions from fuel vehicles. The performance of each algorithm was evaluated using metrics including R2, adjusted R2, root mean square error (RMSE), and runtime.
    The findings revealed that ensemble learning methods have higher prediction accuracy and lower error rates. Ensemble learning algorithms, including Extreme Gradient Boosting (XGB), Random Forest, and Light Gradient-Boosting Machine (LGBM), demonstrated high R2 and low RMSE values. As a result, these ensemble learning-based algorithms were found to be the most effective methods for predicting CO2 emissions. Although deep learning models with complex structures, such as the convolutional neural network (CNN), deep neural network (DNN), and gated recurrent unit (GRU), achieved high R2 values, they took longer to train and required more computational resources. The methodology and findings of this research have a number of important implications for stakeholders striving for environmental sustainability and an ecological world.
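The four evaluation metrics named in this abstract can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the helper names (`r2_score`, `adjusted_r2`, `rmse`, `evaluate`) are hypothetical, and timing the prediction call is an assumption, since the abstract does not say whether "runtime" refers to training, prediction, or both.

```python
import math
import time


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(r2, n_samples, n_features):
    """R2 penalised for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


def rmse(y_true, y_pred):
    """Root mean square error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def evaluate(predict, X, y):
    """Score a prediction function and record how long it took (assumed: prediction time)."""
    start = time.perf_counter()
    y_pred = [predict(row) for row in X]
    runtime = time.perf_counter() - start
    r2 = r2_score(y, y_pred)
    return {
        "r2": r2,
        "adj_r2": adjusted_r2(r2, len(y), len(X[0])),
        "rmse": rmse(y, y_pred),
        "runtime_s": runtime,
    }
```

Calling `evaluate` once per candidate model on the same held-out data gives one comparable row per algorithm, which is the kind of table a study like this would rank.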

  • Article type: Journal Article
    This study sought to develop an unbalanced-ensemble model that could accurately predict death outcomes in patients with comorbid coronary heart disease (CHD) and hypertension and to evaluate the factors contributing to death.
    Medical records of 1058 patients with coronary heart disease combined with hypertension, excluding those with acute coronary syndrome, were collected. Patients were followed up at the first, third, sixth, and twelfth months after discharge to record death events. Follow-up ended two years after discharge. Patients were divided into survival and nonsurvival groups. Based on the medical records, gender, smoking, drinking, COPD, cerebral stroke, diabetes, hyperhomocysteinemia, heart failure, renal insufficiency, and other influencing factors were compiled and compared between the two groups, and feature selection was carried out to construct the models. Owing to class imbalance in the data, we developed four unbalanced-ensemble prediction models, based on Balanced Random Forest (BRF), EasyEnsemble, RUSBoost, and SMOTEBoost, together with two base classification algorithms, AdaBoost and Logistic. Each model was optimised by hyperparameter tuning with GridSearchCV and evaluated using area under the curve (AUC), sensitivity, recall, Brier score, and geometric mean (G-mean). Additionally, to understand the influence of variables on model performance, we constructed a SHapley Additive exPlanations (SHAP) model based on the optimal model.
    There were significant differences in age, heart rate, COPD, cerebral stroke, heart failure, and renal insufficiency between the nonsurvival group and the survival group. Among all models, BRF yielded the highest AUC (0.810; 95% CI, 0.778-0.839), sensitivity (0.990; 95% CI, 0.981-1.000), recall (0.990; 95% CI, 0.981-1.000), and G-mean (0.806; 95% CI, 0.778-0.827), and the lowest Brier score (0.181; 95% CI, 0.178-0.185). Therefore, we identified BRF as the optimal model. Furthermore, red blood cell count (RBC), body mass index (BMI), and lactate dehydrogenase were found to be important mortality-associated risk factors.
    BRF combined with advanced machine learning methods and SHAP is highly effective and accurately predicts mortality in patients with CHD comorbid with hypertension. This model has the potential to assist clinicians in modifying treatment strategies to improve patient outcomes.
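Two of the imbalance-aware metrics mentioned above, G-mean and the Brier score, are easy to state in code. The sketch below assumes the common definitions (G-mean as the square root of sensitivity times specificity, Brier score as the mean squared difference between predicted probability and the 0/1 outcome). The abstract does not spell out its formulas, and the function names are hypothetical.

```python
import math


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return math.sqrt(sens * spec)


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)
```

Because G-mean multiplies the two class-wise rates, a model that ignores the minority class scores near zero even when its overall accuracy looks high, which is why it is a common choice for unbalanced outcomes such as mortality.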

  • Article type: Journal Article
    Soil moisture plays an important role in the water and heat exchanges between the land surface and the atmosphere, and it is of great importance for agricultural production, ecological planning, and water resources management. Although microwave remote sensing has been widely used in large-scale soil moisture monitoring, the accuracy of downscaled retrieval results cannot be guaranteed for regions with high vegetation coverage and high soil heterogeneity. To address these challenges, this study built a soil moisture index set based on MODIS and elevation data by calculating the Pearson correlation coefficient (R) and the Maximum Information Coefficient (MIC), then constructed decision tree models (Gradient Boosting Decision Tree, GBDT, and Random Forest, RF) relating the index set to low-resolution Soil Moisture Active Passive (SMAP) data using two ensemble learning methods (Bagging and Boosting). The models were applied to the high-resolution soil moisture indices in Jilin Province for the years 2017 to 2020 to generate 1 km-resolution products. In the validation process, Triple Collocation Analysis (TCA), comparison of soil moisture maps at coarse and fine resolution, and in-situ measurements in Lishu County, Tongyu County, and Jilin City were used to evaluate the differences between the downscaled soil moisture results and ground observations at network, seasonal, and point scales. The results were as follows: (1) The correlation coefficient (R2) calculated by the TCA method was 0.733 (GBDT_36km) > 0.649 (RF_36km), and the error variance was 0.0004 (GBDT_36km) < 0.00058 (RF_36km). (2) R at network scale was 0.798 (GBDT_SM) > 0.662 (RF_SM) and RMSE was 0.040 (GBDT_SM) < 0.044 (RF_SM); at point scale, R was 0.864 (GBDT_SM) > 0.833 (RF_SM) and RMSE was 0.029 (GBDT_SM) < 0.039 (RF_SM). In all four stages of the growth period, R was GBDT_SM > RF_SM and RMSE was GBDT_SM < RF_SM. In conclusion, the GBDT and RF models can reliably downscale soil moisture in Jilin Province, and the Boosting ensemble learning method represented by GBDT had better estimation performance.
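For readers unfamiliar with Triple Collocation Analysis, the sketch below shows the widely used covariance-based estimate of each product's error variance from three collocated series. It assumes the errors of the three series are mutually uncorrelated and uncorrelated with the true signal. The abstract does not state which TCA formulation the authors used, and the function names are hypothetical.

```python
def covariance(a, b):
    """Sample covariance of two equally long series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def tca_error_variances(x, y, z):
    """Covariance-based triple collocation.

    Under the independence assumptions, C_xy * C_xz / C_yz estimates the
    signal variance as seen by x, so subtracting it from C_xx leaves the
    error variance of x (and likewise for y and z).
    """
    c_xx, c_yy, c_zz = covariance(x, x), covariance(y, y), covariance(z, z)
    c_xy, c_xz, c_yz = covariance(x, y), covariance(x, z), covariance(y, z)
    return (
        c_xx - c_xy * c_xz / c_yz,
        c_yy - c_xy * c_yz / c_xz,
        c_zz - c_xz * c_yz / c_xy,
    )
```

A smaller estimated error variance for one product than another, as reported for GBDT_36km versus RF_36km, indicates less random error relative to the other two collocated series.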

  • Article type: Journal Article
    Cell-type-specific domains are the anatomical domains in spatially resolved transcriptome (SRT) tissues where particular cell types are enriched coincidentally. It is challenging to use existing computational methods to detect specific domains with low-proportion cell types, which are partly overlapped with or even inside other cell-type-specific domains. Here, we propose De-spot, which synthesizes segmentation and deconvolution as an ensemble to generate cell-type patterns, detect low-proportion cell-type-specific domains, and display these domains intuitively. Experimental evaluation showed that De-spot enabled us to discover the co-localizations between cancer-associated fibroblasts and immune-related cells that indicate potential tumor microenvironment (TME) domains in given slices, which were obscured by previous computational methods. We further elucidated the identified domains and found that Srgn may be a critical TME marker in SRT slices. By deciphering T cell-specific domains in breast cancer tissues, De-spot also revealed that the proportions of exhausted T cells were significantly increased in invasive vs. ductal carcinoma.

  • Article type: Journal Article
    In Indonesia, the monitoring of rainfall requires an estimation system with a high resolution and wide spatial coverage because of the complexities of the rainfall patterns. This study built a rainfall estimation model for Indonesia through the integration of data from various instruments, namely, rain gauges, weather radars, and weather satellites. An ensemble learning technique, specifically, extreme gradient boosting (XGBoost), was applied to overcome the sparse data due to the limited number of rain gauge points, limited weather radar coverage, and imbalanced rain data. The model includes bias correction of the satellite data to increase the estimation accuracy. In addition, the data from several weather radars installed in Indonesia were also combined. This research handled rainfall estimates in various rain patterns in Indonesia, such as seasonal, equatorial, and local patterns, with a high temporal resolution, close to real time. The validation was carried out at six points, namely, Bandar Lampung, Banjarmasin, Pontianak, Deli Serdang, Gorontalo, and Biak. The research results show good estimation accuracy, with respective values of 0.89, 0.91, 0.89, 0.9, 0.92, and 0.9, and root mean square error (RMSE) values of 2.75 mm/h, 2.57 mm/h, 3.08 mm/h, 2.64 mm/h, 1.85 mm/h, and 2.48 mm/h. Our research highlights the potential of this model to accurately capture diverse rainfall patterns in Indonesia at high spatial and temporal scales.

  • Article type: Journal Article
    A deep-seated landslide can release numerous microseismic signals from creep-slip movement, which includes rock-soil slip at the slope surface and rock-soil shear rupture in the subsurface. Machine learning can effectively enhance the classification of microseismic signals in landslide seismic monitoring and help interpret the mechanical processes of landslide motion. In this paper, eight sets of triaxial seismic sensors were deployed inside the deep-seated Jiuxianping landslide in China, and a large number of microseismic signals related to the slope movement were obtained through one year of continuous monitoring. All the data were passed through a seismic event identification step based on the ratio of the long-time average and short-time average. We selected 11 days of data, manually classified 4131 records into eight categories, and created a microseismic event database. Classical machine learning algorithms and ensemble learning algorithms were tested. To evaluate the seismic event classification performance of each model, we compared their accuracy, precision, and recall. The validation results demonstrated that the decision tree, the best-performing of the classical machine learning algorithms, had an accuracy of 88.75%, while the ensemble algorithms, including Random Forest, Gradient Boosting Trees, Extreme Gradient Boosting, and Light Gradient Boosting Machine, had accuracies ranging from 93.5% to 94.2% and also achieved better results in the combined evaluation of precision, recall, and F1 score. The classification tests for each individual microseismic event category showed the same pattern. The results suggest that ensemble learning algorithms outperform classical machine learning algorithms on this task.
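The event identification step based on short-time and long-time averages can be illustrated with a simple trigger. This is a sketch under stated assumptions: it uses the conventional STA/LTA form (short-window mean energy divided by long-window mean energy), and the window lengths and threshold are hypothetical illustration values, not the settings used in the study.

```python
def sta_lta(signal, n_sta, n_lta):
    """STA/LTA ratio of signal energy for each sample.

    Samples before a full long window is available are left at 0.0.
    """
    energy = [s * s for s in signal]
    ratios = [0.0] * len(signal)
    for i in range(n_lta - 1, len(signal)):
        sta = sum(energy[i - n_sta + 1:i + 1]) / n_sta
        lta = sum(energy[i - n_lta + 1:i + 1]) / n_lta
        ratios[i] = sta / lta if lta > 0 else 0.0
    return ratios


def detect_events(signal, n_sta, n_lta, threshold):
    """Indices where the STA/LTA ratio reaches the trigger threshold."""
    ratios = sta_lta(signal, n_sta, n_lta)
    return [i for i, r in enumerate(ratios) if r >= threshold]


# Hypothetical trace: quiet background with a short burst in the middle.
trace = [0.1] * 20 + [2.0] * 3 + [0.1] * 10
triggered = detect_events(trace, n_sta=2, n_lta=10, threshold=3.0)
```

The ratio stays near 1 on stationary background noise and rises sharply when a burst enters the short window, so segments around the triggered indices become the candidate events that are later labelled and classified.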

  • Article type: Journal Article
    The detection performance of radar is significantly impaired by active jamming and mutual interference from other radars. This paper proposes a radio signal modulation recognition method to accurately recognize these signals, which supports jamming cancellation decisions. Based on an ensemble learning stacking algorithm improved by meta-feature enhancement, the proposed method adopts random forests, K-nearest neighbors, and Gaussian naive Bayes as the base-learners, with logistic regression serving as the meta-learner. It takes multi-domain features of the signals as input: time-domain features, including fuzzy entropy, slope entropy, and Hjorth parameters; frequency-domain features, including spectral entropy; and fractal-domain features, including fractal dimension. A simulation experiment covering seven common signal types of radar and active jamming was performed for effectiveness validation and performance evaluation. The results demonstrated the proposed method's superior performance over other classification methods, as well as its ability to meet the requirements of low signal-to-noise ratio and few-shot learning.
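Of the time-domain features listed above, the Hjorth parameters are the simplest to write down. The sketch below uses the usual definitions (activity as signal variance, mobility from the variance of the first difference, complexity as the mobility of the first difference relative to that of the signal) with discrete differences and population variance. These discretisation choices and the function names are assumptions, not details taken from the paper.

```python
import math


def variance(values):
    """Population variance of a sequence."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def diff(values):
    """First-order discrete difference."""
    return [b - a for a, b in zip(values, values[1:])]


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) of a 1-D signal."""
    d1 = diff(signal)
    d2 = diff(d1)
    activity = variance(signal)
    mobility = math.sqrt(variance(d1) / activity)
    complexity = math.sqrt(variance(d2) / variance(d1)) / mobility
    return activity, mobility, complexity
```

Mobility and complexity are ratios, so they do not change when the signal is rescaled in amplitude, while activity grows with the square of the scale factor. Feature vectors built from values like these are what the base-learners in a stacking ensemble would consume.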

  • Article type: Journal Article
    Super absorbent polymer (SAP) can enhance the characteristics of cementitious composites in both their fresh and hardened forms. However, the strength of SAP concrete may decrease. This loss can be mitigated by altering the concrete composition and selecting an appropriate type of SAP. This work employs machine learning (ML) to tackle the issue of strength degradation. The analysis considers ten distinct variables linked to concrete composition and the type of SAP. The study uses machine learning approaches that involve both regression and classification tasks. The use of ensemble learning greatly improves the quality and accuracy of the results, showing its advantage in combining several models to produce more precise predictions. The findings demonstrate that the Support Vector Machine (SVM) and Extreme Gradient Boosting (XGBoost) regression algorithms accurately predicted the percentage reduction in strength of SAP concrete from the concrete composition and SAP details, with R2 values of 0.90 and 0.88, respectively. Furthermore, XGBoost exhibited the highest accuracy among the classification algorithms compared, reaching 0.94. According to the results, the ensemble model also achieved a superior mean squared error (MSE). In addition, SHapley Additive exPlanations (SHAP) reveal that some variables, including SAP%, SAP size, and compressive strength, have a significant influence on the strength reduction model. This study aims to bridge the gap between academic research and practical application by developing a web application that employs ensemble learning to forecast the reduction in compressive strength caused by the use of SAP.

  • Article type: Journal Article
    Recognizing emotions from electroencephalography (EEG) signals is a challenging task due to the complex, nonlinear, and nonstationary characteristics of brain activity. Traditional methods often fail to capture these subtle dynamics, while deep learning approaches lack explainability. In this research, we introduce a novel three-phase methodology integrating manifold embedding, multilevel heterogeneous recurrence analysis (MHRA), and ensemble learning to address these limitations in EEG-based emotion recognition.
    The proposed methodology was evaluated using the SJTU-SEED IV database. We first applied uniform manifold approximation and projection (UMAP) for manifold embedding of the 62-lead EEG signals into a lower-dimensional space. We then developed MHRA to characterize the complex recurrence dynamics of brain activity across multiple transition levels. Finally, we employed tree-based ensemble learning methods to classify four emotions (neutral, sad, fear, happy) based on the extracted MHRA features.
    Our approach achieved high performance, with an accuracy of 0.7885 and an AUC of 0.7552, outperforming existing methods on the same dataset. Additionally, our methodology provided the most consistent recognition performance across different emotions. Sensitivity analysis revealed specific MHRA metrics that were strongly associated with each emotion, offering valuable insights into the underlying neural dynamics.
    This study presents a novel framework for EEG-based emotion recognition that effectively captures the complex nonlinear and nonstationary dynamics of brain activity while maintaining explainability. The proposed methodology offers significant potential for advancing our understanding of emotional processing and developing more reliable emotion recognition systems with broad applications in healthcare and beyond.

  • Article type: Journal Article
    Polygenic risk scores (PRS) enhance population risk stratification and advance personalized medicine, but existing methods face several limitations related to computational burden, predictive accuracy, and adaptability to a wide range of genetic architectures. To address these issues, we propose Aggregated L0Learn using Summary-level data (ALL-Sum), a fast and scalable ensemble learning method for computing PRS using summary statistics from genome-wide association studies (GWAS). ALL-Sum leverages an L0L2 penalized regression and ensemble learning across tuning parameters to flexibly model traits with diverse genetic architectures. In extensive large-scale simulations across a wide range of polygenicity and GWAS sample sizes, ALL-Sum consistently outperformed popular alternative methods in terms of prediction accuracy, runtime, and memory usage by 10%, 20-fold, and threefold, respectively, and demonstrated robustness to diverse genetic architectures. We validated the performance of ALL-Sum in real data analysis of 11 complex traits using GWAS summary statistics from nine data sources, including the Global Lipids Genetics Consortium, Breast Cancer Association Consortium, and FinnGen Biobank, with validation in the UK Biobank. Our results show that, on average, ALL-Sum obtained PRS with 25% higher accuracy, 15 times faster computation, and half the memory usage compared with current state-of-the-art methods, and had robust performance across a wide range of traits and diseases. Furthermore, our method demonstrates stable prediction when using linkage disequilibrium computed from different data sources. ALL-Sum is available as a user-friendly R software package with publicly available reference data for streamlined analysis.