Machine-learning

  • Article type: Journal Article
    Decades of research in exercise immunology have demonstrated the profound impact of exercise on the immune response, influencing an individual's disease susceptibility. Accurate prediction of white blood cell (WBC) counts during exercise can help to design effective training programs that maintain optimal immune system function and prevent its suppression. In this regard, this study aimed to develop an easy-to-use and efficient modelling tool for predicting WBC counts during exercise. To achieve this goal, the predictive power of a range of machine-learning algorithms was assessed, including six standalone models (M5 prime (M5P), random forest (RF), alternating model trees (AMT), reduced error pruning tree (REPT), locally weighted learning (LWL), and support vector regression (SVR)) and six hybrid models trained with a bagging (BA) algorithm (BA-M5P, BA-RF, BA-AMT, BA-REPT, BA-LWL, and BA-SVR). A comprehensive database was constructed from 200 eligible people. The models used post-exercise training WBC counts as the output parameter and seven WBC-influencing factors as input parameters: intensity and duration of exercise, pre-exercise training WBC counts, age, body fat percentage, maximal aerobic capacity, and muscle mass. Comparing the models' predictions with the observed WBC counts using standard statistics indicated that the BA-M5P model had the greatest potential to produce a robust prediction of the number of lymphocytes, neutrophils, monocytes, and WBCs compared to the other models. Moreover, pre-exercise training WBC counts, intensity and duration of exercise, and body fat percentage were the most important features in predicting WBC counts. These findings hold significant implications for the advancement of exercise immunology and the promotion of public health.
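    The bagging (BA) idea behind the hybrid models can be sketched in a few lines. This is only an illustration under stated assumptions: the base learner is a one-feature least-squares line rather than M5P or any other model from the abstract, the data are synthetic, and the names `fit_line`, `bagged_fit`, and `bagged_predict` are hypothetical.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample (all x equal): fall back to the mean
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def bagged_fit(xs, ys, n_models=25, seed=0):
    """Fit one base learner per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagged_predict(models, x):
    """Average the predictions of all base learners."""
    return mean(a + b * x for a, b in models)

# Synthetic stand-in data (NOT from the study): a made-up linear relation
# between a pre-exercise and a post-exercise count, plus Gaussian noise.
rng = random.Random(42)
pre = [rng.uniform(4.0, 10.0) for _ in range(200)]
post = [1.0 + 1.2 * p + rng.gauss(0, 0.5) for p in pre]

models = bagged_fit(pre, post)
pred = bagged_predict(models, 7.0)
print(f"bagged prediction at x=7.0: {pred:.2f}")
```

    Averaging over bootstrap resamples mainly reduces variance, so the benefit is usually larger for unstable base learners such as trees than for the linear fit used here.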

  • Article type: Journal Article
    Bladder cancer, a highly fatal disease, poses a significant threat to patients. Positioned at 19q13.2-13.3, LIG1, one of the four DNA ligases in mammalian cells, is frequently deleted in tumour cells of diverse origins. Despite this, the precise involvement of LIG1 in BLCA remains elusive. This investigation examines the impact of LIG1 on BLCA. Our primary objective is to elucidate the interplay between LIG1 and BLCA and to explore its correlation with various clinicopathological factors.
    We retrieved gene expression data of para-carcinoma tissues and bladder cancer (BLCA) from the GEO repository. Single-cell sequencing data were processed using the "Seurat" package. Differential expression analysis was then performed with the "Limma" package. Scale-free gene co-expression networks were constructed using the "WGCNA" package. Subsequently, a Venn diagram was used to extract genes from the positively correlated modules identified by WGCNA and intersect them with differentially expressed genes (DEGs), isolating the overlapping genes. The "STRINGdb" package was employed to establish the protein-protein interaction (PPI) network. Hub genes were identified through the PPI network using the Betweenness Centrality (BC) algorithm. We conducted KEGG and GO enrichment analyses to uncover the regulatory mechanisms and biological functions associated with the hub genes. A machine-learning diagnostic model was established using the R package "mlr3verse". Mutation profiles of the LIG1^high and LIG1^low groups were visualized using the BEST website. Survival analyses within the LIG1^high and LIG1^low groups were performed using the BEST website and the GENT2 website. Finally, a series of functional experiments was performed to validate the functional role of LIG1 in BLCA.
    Our investigation revealed an upregulation of LIG1 in BLCA specimens, with heightened LIG1 levels correlating with unfavorable overall survival outcomes. Functional enrichment analysis of hub genes (GO and KEGG) highlighted LIG1's involvement in critical functions such as DNA replication, cellular senescence, the cell cycle, and the p53 signalling pathway. Notably, the mutational landscape of BLCA varied significantly between the LIG1^high and LIG1^low groups. Immune infiltration analyses suggested a pivotal role for LIG1 in immune cell recruitment and immune regulation within the BLCA microenvironment, thereby impacting prognosis. Subsequent experimental validations further underscored the significance of LIG1 in BLCA pathogenesis, consolidating its functional relevance in BLCA samples.
    Our research demonstrates that LIG1 plays a crucial role in promoting bladder cancer malignant progression by heightening proliferation, invasion, EMT, and other key functions, thereby serving as a potential risk biomarker.

  • Article type: Journal Article
    BACKGROUND: Firearm injuries constitute a public health crisis. At the healthcare encounter level, however, they are rare events.
    OBJECTIVE: To develop a predictive model to identify healthcare encounters of adult patients at increased risk of firearm injury to target screening and prevention efforts.
    METHODS: Electronic health records data from Kaiser Permanente Southern California (KPSC) were used to identify healthcare encounters of patients with fatal and non-fatal firearm injuries, as well as healthcare visits of a sample of matched controls, during 2010-2018. More than 170 predictors, including diagnoses, healthcare utilization, and neighborhood characteristics, were identified. Extreme gradient boosting (XGBoost) and a split-sample design were used to train and test a model that predicted risk of firearm injury within the next 3 years at the encounter level.
    RESULTS: A total of 3879 firearm injuries were identified among 5 288 529 KPSC adult members. Prevalence at the healthcare encounter level was 0.01%. The 15 most important predictors included demographics, healthcare utilization, and neighborhood-level socio-economic factors. The sensitivity and specificity of the final model were 0.83 and 0.56, respectively. A very high-risk group (top 1% of predicted risk) yielded a positive predictive value of 0.14% and sensitivity of 13%. This high-risk group potentially reduces screening burden by a factor of 11.7, compared to universal screening. Results for alternative probability cutoffs are presented.
    CONCLUSIONS: Our model can support more targeted screening in healthcare settings, resulting in improved efficiency of firearm injury risk assessment and prevention efforts.
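    The reported figures for the very high-risk group can be cross-checked with one identity: when a fixed fraction of encounters is flagged, PPV = sensitivity × prevalence / flagged fraction. The sketch below plugs in the rounded values from the abstract; the function name is hypothetical, and the abstract does not state how the factor of 11.7 was derived, so the "lift" computed here is only a comparable quantity, not a reproduction of it.

```python
def ppv_from_flag_rate(sensitivity, prevalence, flagged_fraction):
    """PPV = true positives / all flagged = (sens * prev) / flagged fraction."""
    return sensitivity * prevalence / flagged_fraction

prevalence = 0.0001   # 0.01% of encounters (rounded, as reported)
sensitivity = 0.13    # reported for the top-1% group
flagged = 0.01        # top 1% of predicted risk

ppv = ppv_from_flag_rate(sensitivity, prevalence, flagged)
lift = ppv / prevalence  # enrichment relative to screening every encounter

print(f"PPV = {ppv:.2%}, lift = {lift:.0f}x")
```

    With these rounded inputs the PPV comes out slightly below the reported 0.14%, which is consistent with the published prevalence and sensitivity having been rounded.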

  • Article type: Journal Article
    BACKGROUND: This study aimed to predict early adolescent sleep problems using pregnancy and childbirth risk factors through machine learning algorithms, and to evaluate model performance internally and externally.
    METHODS: Data from the China Jintan Child Cohort study (CJCC; n=848) were used for model development and data from the US Healthy Brain and Behavior Study (HBBS; n=454) for external validation. Maternal pregnancy histories, obstetric data, and adolescent sleep problems were collected. Several machine learning techniques were employed, including least absolute shrinkage and selection operator, logistic regression, random forest, naïve Bayes, extreme gradient boosting, decision tree, and neural network. The area under the receiver operating characteristic curve, sensitivity, specificity, accuracy, and root mean square of residuals were used to evaluate model performance.
    RESULTS: Key predictors of CJCC adolescents' sleep problems included gestational age, birthweight, duration of delivery, and maternal happiness during pregnancy. In HBBS adolescents, the duration of postnatal depressive emotions was the primary perinatal predictor. The prediction models developed in the CJCC had good-to-excellent internal validation performance but performed poorly in predicting sleep problems in HBBS adolescents.
    CONCLUSIONS: The identification of specific perinatal risk factors associated with adolescent sleep problems can inform targeted interventions during and after pregnancy to mitigate these risks. Health providers should consider integrating these predictive factors into routine pre- and postnatal assessments to identify at-risk populations. The variability in model performance across different cohorts highlights the need for context-specific models and the cautious application of predictive analytics across diverse populations. Future research should focus on refining predictive models to account for such variations, potentially through the incorporation of additional socio-cultural factors and genetic markers. This study emphasizes the importance of personalized and culturally sensitive approaches in the prediction and management of adolescent sleep problems, leveraging advanced computational methods to enhance maternal and child health outcomes.

  • Article type: Journal Article
    Enhlink is a computational tool for scATAC-seq data analysis, facilitating precise interrogation of enhancer function at the single-cell level. It employs an ensemble approach incorporating technical and biological covariates to infer condition-specific regulatory DNA linkages. When multi-omic data are available, Enhlink can integrate them for enhanced specificity. Evaluation with simulated and real data, including multi-omic datasets from the mouse striatum and novel promoter capture Hi-C data, demonstrates that Enhlink outperforms alternative methods. Coupled with eQTL analysis, it identified a putative super-enhancer in striatal neurons. Overall, Enhlink offers accuracy, power, and potential for revealing novel biological insights in gene regulation.

  • Article type: Journal Article
    BACKGROUND: Axillary lymph node dissection (ALND) is a standard procedure for early-stage breast cancer (BC) patients with three or more positive sentinel lymph nodes (SLNs). However, ALND can lead to significant postoperative complications without always providing additional clinical benefits. This study aims to develop machine-learning (ML) models to predict non-sentinel lymph node (non-SLN) metastasis in Chinese BC patients with three or more positive SLNs, potentially allowing the omission of ALND.
    METHODS: Data from 2217 BC patients who underwent SLN biopsy at Shantou University Medical College were analyzed, with 634 having positive SLNs. Patients were categorized into those with ≤ 2 positive SLNs and those with ≥ 3 positive SLNs. We applied nine ML algorithms to predict non-SLN metastasis. Model performance was evaluated using ROC curves, precision-recall curves, and calibration curves. Decision Curve Analysis (DCA) assessed the clinical utility of the models.
    RESULTS: The RF model showed superior predictive performance, achieving an AUC of 0.987 in the training set and 0.828 in the validation set. Key predictive features included size of positive SLNs, tumor size, number of SLNs, and ER status. In external validation, the RF model achieved an AUC of 0.870, demonstrating robust predictive capabilities.
    CONCLUSIONS: The developed RF model accurately predicts non-SLN metastasis in BC patients with ≥ 3 positive SLNs, suggesting that ALND might be avoided in selected patients by applying additional axillary radiotherapy. This approach could reduce the incidence of postoperative complications and improve patient quality of life. Further validation in prospective clinical trials is warranted.

  • Article type: Journal Article
    Mortality during the post-weaning phase is a critical indicator of swine production system performance, influenced by a complex interaction of multiple factors of the epidemiological triad. This study leveraged retrospective data from 1723 groups of pigs marketed within a US swine production system to develop a Wean-Quality Score (WQS) using machine learning techniques. The study evaluated three machine learning models (Random Forest, Support Vector Machine, and Gradient Boosting Machine) to classify groups as having high or low 60-day mortality. High-mortality groups were the 25% of groups in the study population with the highest mortality values (n=431; 60-day mortality=9.98%), and the remaining 75% of groups were low mortality (n=1292; 60-day mortality=2.75%). The best-performing model, Random Forest (RF), outperformed the other ML models in accuracy (0.90), sensitivity (0.84), and specificity (0.92), and was selected for further analysis, which consisted of creating the WQS and ranking the most important factors for classifying groups as high or low mortality. The most important factors ranked through the RF model to classify groups with high mortality were pre-weaning mortality, weaning age, average parity of litters in sow farms, and PRRS status. Additionally, stocking conditions such as stocking density and time to fill the barn were important predictors of high mortality. The WQS correlated (r = 0.74) with the actual 60-day mortality of the groups, offering a valuable tool for assessing post-weaning survivability in swine production systems before weaning. This study highlights the potential of machine learning and comprehensive data utilization to improve the assessment and management of weaned pig quality in commercial swine production; producers can use the WQS to identify at-risk groups and intervene.
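    The high/low labelling step can be sketched as a rank-based split. The mortality values below are synthetic draws, not the study data, and `label_top_quartile` is a hypothetical helper; the abstract does not say how ties or rounding were handled, so rounding 0.25 × n to the nearest integer is an assumption that happens to reproduce the reported group sizes.

```python
import random

def label_top_quartile(mortality):
    """Return 1 for groups in the highest 25% of mortality, else 0."""
    n_high = round(0.25 * len(mortality))
    # Indices ordered from highest to lowest mortality.
    order = sorted(range(len(mortality)), key=lambda i: mortality[i], reverse=True)
    high = set(order[:n_high])
    return [1 if i in high else 0 for i in range(len(mortality))]

# Synthetic 60-day mortality percentages for 1723 groups (illustrative only).
rng = random.Random(1)
mortality = [rng.lognormvariate(1.2, 0.6) for _ in range(1723)]

labels = label_top_quartile(mortality)
n_high = sum(labels)
n_low = len(labels) - n_high
print(n_high, n_low)
```

    These binary labels would then serve as the classification target for a model such as a random forest.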

  • Article type: Journal Article
    Codon usage bias, or the unequal use of synonymous codons, is observed across genes, genomes, and between species. It has been implicated in many cellular functions, such as translation dynamics and transcript stability, but can also be shaped by neutral forces. We characterized codon usage across 1,154 strains from 1,051 species from the fungal subphylum Saccharomycotina to gain insight into the biases, molecular mechanisms, evolution, and genomic features contributing to codon usage patterns. We found a general preference for A/T-ending codons and correlations between codon usage bias, GC content, and tRNA-ome size. Codon usage bias is distinct between the 12 orders to such a degree that yeasts can be classified with an accuracy greater than 90% using a machine-learning algorithm. We also characterized the degree to which codon usage bias is impacted by translational selection. We found it was influenced by a combination of features, including the number of coding sequences, BUSCO count, and genome length. Our analysis also revealed an extreme bias in codon usage in the Saccharomycodales associated with a lack of predicted arginine tRNAs that decode CGN codons, leaving only the AGN codons to encode arginine. Analysis of Saccharomycodales gene expression, tRNA sequences, and codon evolution suggests that avoidance of the CGN codons is associated with a decline in arginine tRNA function. Consistent with previous findings, codon usage bias within the Saccharomycotina is shaped by genomic features and GC bias. However, we find cases of extreme codon usage preference and avoidance along yeast lineages, suggesting additional forces may be shaping the evolution of specific codons.
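    The CGN-versus-AGN contrast for arginine can be illustrated by counting codons in a coding sequence. This is a minimal sketch, not the authors' pipeline: the sequence is a made-up toy, the function name is hypothetical, and reading starts at position 0 in frame. Note that in the standard genetic code only AGA and AGG of the AGN family encode arginine (AGC and AGT encode serine), so the code counts those two.

```python
from collections import Counter

ARG_CGN = {"CGT", "CGC", "CGA", "CGG"}  # arginine codons of the CGN family
ARG_AGR = {"AGA", "AGG"}                # arginine codons of the AGN family

def arginine_codon_usage(cds):
    """Count arginine codons in an in-frame coding sequence.

    Returns (counts per codon, fraction of arginine codons that are CGN);
    the fraction is None when the sequence has no arginine codons.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in ARG_CGN or c in ARG_AGR)
    total = sum(counts.values())
    cgn = sum(counts[c] for c in ARG_CGN)
    return counts, (cgn / total if total else None)

# Toy sequence: ATG AGA AGG CGT AGA GCT AGA TAA
toy_cds = "ATGAGAAGGCGTAGAGCTAGATAA"
counts, cgn_fraction = arginine_codon_usage(toy_cds)
print(dict(counts), cgn_fraction)
```

    A CGN fraction near zero across a genome's coding sequences would correspond to the kind of CGN avoidance the abstract describes.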

  • Article type: Journal Article
    Artificial Intelligence (AI), particularly Machine Learning (ML), has gained attention for its potential in various domains. However, approaches integrating symbolic AI with ML on Knowledge Graphs have not gained significant focus yet. We argue that exploiting RDF/OWL semantics while conducting ML could provide useful insights. We present a use case using signaling pathways from the Reactome database to explore drug safety. Promising outcomes suggest the need for further investigation and collaboration with domain experts.

  • Article type: Journal Article
    Diarrheal disease, characterized by high morbidity and mortality rates, continues to be a serious public health concern, especially in developing nations such as Ethiopia. The significant burden it imposes on these countries underscores the importance of identifying predictors of diarrhea. The use of machine learning techniques to identify significant predictors of diarrhea in children under the age of 5 in Ethiopia's Amhara Region is not well documented. Therefore, this study aimed to address this gap.
    The data for this study were extracted from the Ethiopian Population and Health Survey. We applied seven machine learning classifiers (random forest, logistic regression, K-nearest neighbors, decision tree, support vector machine, gradient boosting, and naive Bayes) to predict diarrhea and its determinants in children under the age of 5 in Ethiopia. Finally, SHapley Additive exPlanations (SHAP) value analysis was performed to assess the contribution of each predictor.
    Among the seven models used, the random forest algorithm showed the highest accuracy in predicting diarrheal disease, with an accuracy of 81.03% and an area under the curve of 86.50%. The following factors showed a significant association with diarrheal disease: richest household wealth status (log odds of -0.04), no history of acute respiratory infections (ARIs) (log odds of -0.08), mothers without a job (log odds of -0.04), children aged between 23 and 36 months (log odds of -0.03), mothers with higher education (log odds of -0.03), urban residence (log odds of -0.01), use of electricity for cooking (log odds of -0.12), absence of signs of wasting, and not having taken medication for intestinal parasites.
    We recommend implementing programs to reduce the incidence of diarrhea in children under the age of 5 in the Amhara region. These programs should focus on removing socioeconomic barriers that impede mothers' access to wealth, a favorable work environment, cooking fuel, education, and healthcare for their children.