machine learning model

  • Article type: Journal Article
    Chemical pretreatment is a common method to enhance the cumulative methane yield (CMY) of lignocellulosic waste (LW), but its effectiveness depends on various factors, and accurately estimating the methane production of pretreated LW remains a challenge. Here, based on 254 LW samples, a machine learning (ML) model to predict the methane production performance of pretreated feedstock was constructed using two automated ML platforms: the Tree-based Pipeline Optimization Tool (TPOT) and Neural Network Intelligence (NNI). Furthermore, the interactive effects of pretreatment conditions, feedstock properties, and digestion conditions on the methane production of pretreated LW were studied through model interpretability analysis. The optimal ML model performed well on the validation set, and digestion time, pretreatment agent, and lignin content (LC) were found to be the key factors affecting the methane production of pretreated LW. If the LC in the raw LW was lower than 15%, the maximum CMY might be achieved using NaOH, KOH, and alkaline hydrogen peroxide (AHP) at concentrations of 3.8%, 4.4%, and 4.5%, respectively. On the other hand, if the LC was higher than 15%, only high concentrations of AHP exceeding 4% could significantly increase methane production. This study provides valuable guidance for optimizing the pretreatment process, comparing different chemical pretreatment approaches, and regulating the operation of large-scale biogas plants.

  • Article type: Journal Article
    The histological grade of oral squamous cell carcinoma affects the prognosis. In the present study, we performed a radiomics analysis to extract features from 18F-FDG PET image data, created machine learning models from the features, and verified the accuracy of the prediction of the histological grade of oral squamous cell carcinoma. The subjects were 191 patients who underwent a preoperative 18F-FDG PET examination, whose histopathological grade was confirmed after surgery, and whose tumors were large enough for a radiomics analysis. These patients were split in a 70%/30% ratio for use as training data and testing data, respectively. We extracted 2993 radiomics features from the PET images of each patient. Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Naïve Bayes (NB), and K-Nearest Neighbor (KNN) machine learning models were created. The areas under the curve obtained from receiver operating characteristic curves for the prediction of the histological grade of oral squamous cell carcinoma were 0.72, 0.71, 0.84, 0.74, and 0.73 for LR, SVM, RF, NB, and KNN, respectively. We confirmed that a PET radiomics analysis is useful for the preoperative prediction of the histological grade of oral squamous cell carcinoma.
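
The area under the ROC curve reported in the abstract above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that rank-based computation in plain Python; the function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a test set of a few dozen patients; a sort-based variant would be preferable for large data sets.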

  • Article type: Journal Article
    There is a complex, high-dimensional, nonlinear mapping relationship between the compressive strength of High-Performance Concrete (HPC) and its components, which strongly affects the accurate prediction of compressive strength. In this paper, an efficient and robust software calculation strategy combining BP Neural Network (BPNN), Support Vector Machine (SVM), and Genetic Algorithm (GA) is proposed for predicting the compressive strength of HPC. Eight features were extracted from the previous literature, and a compressive strength database containing 454 sets of data was constructed. The models were trained and tested, and the performance of four Machine Learning (ML) models, namely BPNN, SVM, GA-BPNN, and GA-SVM, was compared. The results show that the coupled models are superior to the single models. Moreover, because GA-SVM has better generalization ability and a stronger theoretical basis, its convergence speed and prediction accuracy are better than those of GA-BPNN. Grey Relational Analysis (GRA) and Shapley analysis were then used to verify the interpretability of the GA-SVM model, which showed that the water-binder ratio had the most significant influence on the compressive strength. Finally, evaluating the compressive strength with combinations of multiple input variables supplemented this research and again verified the significant influence of the water-binder ratio, providing reference value for subsequent research.

  • Article type: Letter
    No abstract available.

  • Article type: Journal Article
    BACKGROUND: Identifying individuals with depressive symptomatology (DS) promptly and effectively is of paramount importance for providing timely treatment. Machine learning models have shown promise in this area; however, studies often fall short in demonstrating the practical benefits of using these models and fail to provide tangible real-world applications.
    OBJECTIVE: This study aims to establish a novel methodology for identifying individuals likely to exhibit DS, identify the most influential features in a more explainable way via probabilistic measures, and propose tools that can be used in real-world applications.
    METHODS: The study used 3 data sets: PROACTIVE, the Brazilian National Health Survey (Pesquisa Nacional de Saúde [PNS]) 2013, and PNS 2019, comprising sociodemographic and health-related features. A Bayesian network was used for feature selection. Selected features were then used to train machine learning models to predict DS, operationalized as a score of ≥10 on the 9-item Patient Health Questionnaire. The study also analyzed the impact of varying sensitivity rates on the reduction of screening interviews compared to a random approach.
    RESULTS: The methodology allows the users to make an informed trade-off among sensitivity, specificity, and a reduction in the number of interviews. At the thresholds of 0.444, 0.412, and 0.472, determined by maximizing the Youden index, the models achieved sensitivities of 0.717, 0.741, and 0.718, and specificities of 0.644, 0.737, and 0.766 for PROACTIVE, PNS 2013, and PNS 2019, respectively. The area under the receiver operating characteristic curve was 0.736, 0.801, and 0.809 for these 3 data sets, respectively. For the PROACTIVE data set, the most influential features identified were postural balance, shortness of breath, and how old people feel they are. In the PNS 2013 data set, the features were the ability to do usual activities, chest pain, sleep problems, and chronic back problems. The PNS 2019 data set shared 3 of the most influential features with the PNS 2013 data set. However, the difference was the replacement of chronic back problems with verbal abuse. It is important to note that the features contained in the PNS data sets differ from those found in the PROACTIVE data set. An empirical analysis demonstrated that using the proposed model led to a potential reduction in screening interviews of up to 52% while maintaining a sensitivity of 0.80.
    CONCLUSIONS: This study developed a novel methodology for identifying individuals with DS, demonstrating the utility of using Bayesian networks to identify the most significant features. Moreover, this approach has the potential to substantially reduce the number of screening interviews while maintaining high sensitivity, thereby facilitating improved early identification and intervention strategies for individuals experiencing DS.
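
The thresholds in the RESULTS above were chosen by maximizing the Youden index, J = sensitivity + specificity - 1. The following is a minimal sketch of that selection rule in plain Python, assuming binary labels and scores where higher means more likely positive; the function name `youden_threshold` and the toy data are illustrative assumptions and do not come from the study.

```python
def youden_threshold(labels, scores):
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    A case is predicted positive when its score is >= threshold.
    On ties in J, the lowest candidate threshold is kept.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    best = None
    for t in sorted(set(scores)):  # every observed score is a candidate cut
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, t, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical toy data, for illustration only.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.10, 0.20, 0.45, 0.30, 0.40, 0.60, 0.70, 0.90]
threshold, sens, spec = youden_threshold(toy_labels, toy_scores)
print(threshold, sens, spec)
```

Maximizing J weights sensitivity and specificity equally. A screening application that targets a fixed sensitivity, such as the 0.80 mentioned in the abstract, would instead pick the highest threshold that still meets that sensitivity.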

  • Article type: Journal Article
    The incidence of Alzheimer's disease (AD) is rising globally, yet its treatment and prediction remain challenging due to the complex pathophysiological mechanisms associated with it. Consequently, the objective of the present study was to analyze and characterize the molecular mechanisms underlying ferroptosis-related genes (FEGs) in the pathogenesis of AD, as well as to construct a prognostic model. The findings will provide new insights for the future diagnosis and treatment of AD. First, the AD dataset GSE33000 from the Gene Expression Omnibus database and the FEGs from FerrDB were obtained. Next, unsupervised cluster analysis was used to obtain the FEGs that were most relevant to AD. Enrichment analyses were then performed on the FEGs to explore their biological functions, and the role of these genes in the immune microenvironment was elucidated through CIBERSORT. The optimal machine learning model was selected by comparing the performance of different machine learning models. To validate the prediction efficiency, the models were validated using nomograms, calibration curves, decision curve analysis, and external datasets. Furthermore, the expression of FEGs between different groups was verified using reverse transcription quantitative PCR and western blot analysis. In AD, alterations in the expression of FEGs affected the aggregation and infiltration of certain immune cells, indicating that the occurrence of AD is strongly associated with immune infiltration. Finally, the most appropriate machine learning models were selected, and AD diagnostic models and nomograms were built. The present study provides novel insights that enhance the understanding of the molecular mechanism of action of FEGs in AD. Moreover, it provides biomarkers that may facilitate the diagnosis of AD.

  • Article type: Journal Article
    Bisphenol A diglycidyl ether (BADGE), a derivative of the well-known endocrine disruptor Bisphenol A (BPA), is a potential threat to long-term environmental health due to its prevalence as a micropollutant. This study addresses the previously unexplored area of BADGE toxicity and removal. We investigated, for the first time, the biodegradation potential of laccase isolated from thermophilic Geobacillus bacteria against BADGE. The laccase-mediated degradation process was optimized using a combination of response surface methodology (RSM) and machine learning models. Degradation of BADGE was analyzed by various techniques, including UV-Vis spectrophotometry, high-performance liquid chromatography (HPLC), Fourier transform infrared (FTIR) spectroscopy, and gas chromatography-mass spectrometry (GC-MS). Laccase from Geobacillus stearothermophilus strain MB600 achieved a degradation rate of 93.28% within 30 min, while laccase from Geobacillus thermoparafinivorans strain MB606 reached 94% degradation within 90 min. RSM analysis predicted the optimal degradation conditions to be a reaction time of 60 min, a temperature of 80°C, and pH 4.5. Furthermore, CB-Dock simulations revealed good binding interactions between the laccase enzymes and BADGE, with an initial binding mode selected for a cavity size of 263 and a Vina score of -5.5, which supported the observed biodegradation potential of laccase. These findings highlight the biocatalytic potential of laccases derived from thermophilic Geobacillus strains, notably MB600, for enzymatic decontamination of BADGE-contaminated environments.

  • Article type: Journal Article
    Achieving precise estimates of battery cycle life is a formidable challenge due to the nonlinear nature of battery degradation. This study explores an approach using machine learning (ML) methods to predict the cycle life of lithium-metal-based rechargeable batteries with a high-mass-loading LiNi0.8Mn0.1Co0.1O2 electrode, which exhibit a more complicated electrochemical profile under battery operating conditions than the typically studied LiFePO₄/graphite-based rechargeable batteries. By extracting diverse features from the discharge, charge, and relaxation processes, the approach captures the intricacies of cell behavior without relying on specific degradation mechanisms. The best-performing ML model, after feature selection, achieves an R2 of 0.89, showcasing the application of ML in accurately forecasting cycle life. Feature importance analysis identifies the logarithm of the minimum value of the discharge capacity difference between cycles 100 and 10 (Log(|min(ΔDQ 100-10(V))|)) as the most important feature. Despite the inherent challenges, this model demonstrates a 6.6% test error on unseen data, underscoring its robustness and its potential to advance battery management systems. This study contributes to the successful application of ML to cycle life prediction for lithium-metal-based rechargeable batteries with a practically high energy density design.
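
The feature Log(|min(ΔDQ 100-10(V))|) named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the two discharge-capacity curves are assumed to be sampled on the same voltage grid, the logarithm is assumed to be base 10 (the abstract does not state the base), and the function name `log_min_delta_q` and the toy curves are hypothetical rather than taken from the study.

```python
import math


def log_min_delta_q(q_cycle_100, q_cycle_10):
    """log10 of the absolute minimum of Q_100(V) - Q_10(V).

    Both inputs are discharge capacities sampled at the same voltage
    points (an assumption of this sketch), in the same units.
    """
    if not q_cycle_100 or len(q_cycle_100) != len(q_cycle_10):
        raise ValueError("curves must be non-empty and of equal length")
    delta = [late - early for late, early in zip(q_cycle_100, q_cycle_10)]
    magnitude = abs(min(delta))  # most negative difference = largest fade
    if magnitude == 0:
        raise ValueError("logarithm undefined when the minimum difference is 0")
    return math.log10(magnitude)  # base 10 is an assumption


# Hypothetical toy curves (capacity at four shared voltage points).
q10 = [0.0, 1.0, 2.0, 3.0]
q100 = [0.0, 0.99, 1.9, 2.95]
feature = log_min_delta_q(q100, q10)
print(feature)
```

In practice, measured curves from different cycles rarely share a voltage grid, so each curve would first be interpolated onto a common set of voltages before taking the difference.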

  • Article type: Journal Article
    Pollen collected by pollinators can be used as a marker of foraging behavior and can also indicate the botanical species present in each environment. Pollen intake is essential for pollinators' health and survival. During foraging, some pollinators, such as honeybees, manipulate the collected pollen by mixing it with salivary secretions and nectar (corbicular pollen), which changes the chemical profile of the pollen. Different tools have been developed to identify the botanical origin of pollen, based on microscopy, spectrometry, or molecular markers. However, to date, corbicular pollen has never been investigated. In our work, corbicular pollen from 5 regions with different climate conditions was collected during spring. Pollens were identified with microscopy-based techniques and then analyzed by MALDI-MS. Four different chemical extraction solutions and two physical disruption methods were tested to obtain an effective MALDI-MS protocol. The best performance was obtained using a sonication disruption method after extraction with acetic acid or trifluoroacetic acid. Therefore, we propose a new rapid and reliable methodology for identifying the botanical origin of corbicular pollens using MALDI-MS. This new approach opens the door to a wide range of environmental studies, from plant biodiversity to ecosystem trophic interactions.

  • Article type: Journal Article
    Under international advocacy for a low-carbon and healthy lifestyle, ambient PM2.5 pollution poses a dilemma for urban residents who wish to engage in outdoor exercise and adopt active low-carbon commuting. In this study, an Urban Air Health Navigation System (UAHNS) was designed and proposed to assist users by recommending routes with the least PM2.5 exposure and dynamically issuing early risk warnings based on topologized digital maps, an application programming interface (API), an eXtreme Gradient Boosting (XGBoost) model, and two-step spatial interpolation. The functions and applications of the UAHNS were tested in Wuhan city. The results showed that, compared with trained random forest (RF), LightGBM, and AdaBoost models, among others, the XGBoost model performed better, with an R2 exceeding 0.90 and an RMSE of approximately 15.74 μg/m3, based on data from national air and meteorological monitoring stations. Further, the two-step spatial interpolation model was adopted to dynamically generate the pollution distribution at a spatial resolution of 300 m × 300 m. An exposure comparison was then performed for randomly selected commuting routes and times in Wuhan, showing that the recommended routes effectively lowered PM2.5 exposure, with route difference ratios of about 14.9% and 16.9% for riding and walking, respectively. Finally, the UAHNS platform was fully implemented for Wuhan, consisting of a real-time PM2.5 query, a one-hour PM2.5 prediction function at any location, health navigation on a city map, and a personalized health information query.
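
The two accuracy measures quoted in the abstract above, R2 (coefficient of determination) and RMSE (root mean squared error), follow standard definitions and can be computed as in the sketch below. The function name `r2_and_rmse` and the toy concentration values are illustrative assumptions and are unrelated to the Wuhan data.

```python
import math


def r2_and_rmse(observed, predicted):
    """Return (R^2, RMSE) for paired observed and predicted values.

    R^2  = 1 - SS_res / SS_tot
    RMSE = sqrt(SS_res / n), in the same units as the inputs.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / n)


# Hypothetical PM2.5 values in ug/m3, for illustration only.
toy_observed = [10.0, 20.0, 30.0, 40.0]
toy_predicted = [12.0, 18.0, 33.0, 37.0]
r2, rmse = r2_and_rmse(toy_observed, toy_predicted)
print(round(r2, 3), round(rmse, 3))
```

R2 is unitless and describes the share of variance explained, while RMSE keeps the units of the target, which is why the abstract reports it in μg/m3.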