Machine learning strategy

  • Article type: English Abstract
    To screen for long non-coding RNA (lncRNA) molecular markers characteristic of osteoarthritis (OA) using the Gene Expression Omnibus (GEO) database combined with machine learning.
    Samples from 185 OA patients and 76 healthy individuals (normal controls) were included in the study. GEO datasets were screened for differentially expressed lncRNAs. Three algorithms, the least absolute shrinkage and selection operator (LASSO), support vector machine recursive feature elimination (SVM-RFE), and random forest (RF), were used to screen for candidate lncRNA models, and receiver operating characteristic (ROC) curves were plotted to evaluate the models. Peripheral blood samples were collected from 30 clinical OA patients and 15 healthy controls, and immunoinflammatory indicators were measured. RT-PCR was performed to quantify the expression of the lncRNA molecular markers in peripheral blood mononuclear cells (PBMC). Pearson analysis was performed to examine the correlation between the lncRNAs and indicators of immune system inflammation.
    A total of 14 key markers were identified with LASSO, 6 genes with SVM-RFE, and 24 genes with RF. A Venn diagram was used to find the genes identified by all three algorithms, showing HOTAIR, H19, MIR155HG, and NKILA to be the overlapping genes. The ROC curves showed that each of these four lncRNAs had an area under the curve (AUC) greater than 0.7. The RT-PCR findings revealed relatively elevated expression of HOTAIR, H19, and MIR155HG and decreased expression of NKILA in the PBMC of OA patients compared with the normal group (P<0.01). The results were consistent with the bioinformatics predictions. Pearson analysis showed that the candidate lncRNAs were correlated with clinical indicators of inflammation.
    HOTAIR, H19, MIR155HG, and NKILA can be used as molecular markers for the clinical diagnosis of OA and are correlated with clinical indicators of immune system inflammation.
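
The Venn-diagram step described above (keeping only the features chosen by all three algorithms) can be sketched as a set intersection. This is a minimal illustration, not the authors' code: the feature names `lnc_A` to `lnc_H` and the function `overlapping_features` are hypothetical placeholders, since the abstract reports only the size of each selection and the final overlap.

```python
# Hypothetical selections from three feature-selection methods.
# The names are placeholders, not the genes reported in the abstract.
lasso_selected = {"lnc_A", "lnc_B", "lnc_C", "lnc_D", "lnc_E"}
svm_rfe_selected = {"lnc_B", "lnc_C", "lnc_F"}
rf_selected = {"lnc_A", "lnc_B", "lnc_C", "lnc_G", "lnc_H"}


def overlapping_features(*selections):
    """Return the features chosen by every selection method, sorted by name."""
    if not selections:
        return []
    # Intersect the first selection with all remaining ones.
    common = set(selections[0]).intersection(*selections[1:])
    return sorted(common)


consensus = overlapping_features(lasso_selected, svm_rfe_selected, rf_selected)
print(consensus)
```

Requiring agreement across all methods is a conservative choice: it shrinks the candidate list, at the cost of discarding features that only one or two methods rank highly.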

  • Article type: Journal Article
    The large-scale use of petrochemical-based plastics has proved to be a significant source of environmental pollution because of their non-biodegradable nature. Microbial enzymes such as esterases, cutinases, and lipases have shown the ability to degrade synthetic plastics. However, enzymatic degradation of plastics is primarily limited by the lack of a robust enzymatic system, i.e., by low activity and stability towards plastic degradation. Recently, a machine learning strategy combining structure-based and deep neural network approaches has shown promising potential to generate a functional, active, stable, and tolerant polyethylene terephthalate (PET) degrading enzyme (FAST-PETase). FAST-PETase showed the highest PET hydrolytic activity among known enzymes or their variants and degraded a broad range of plastics. A closed-loop, circular-economy-based system in which FAST-PETase degrades plastic to monomers and the monomers are then re-polymerized into clean plastics could be a more sustainable approach. As an alternative to synthetic plastics, diverse microbes can produce polyhydroxyalkanoates, whose degradation by microbes is well established. This article discusses recent updates in the enzymatic degradation of plastics for sustainable development.

  • Article type: Journal Article
    Diabetic kidney disease (DKD) is the leading cause of chronic kidney disease and end-stage renal disease worldwide. Early diagnosis is critical to prevent its progression. The aim of this study was to identify potential diagnostic biomarkers for DKD, illustrate the biological processes related to the biomarkers, and investigate the relationship between the biomarkers and immune cell infiltration.
    Gene expression profiles (GSE30528, GSE96804, and GSE99339) for samples obtained from DKD patients and controls were downloaded from the Gene Expression Omnibus database as a training set, and the gene expression profiles GSE47185 and GSE30122 were downloaded as a validation set. Differentially expressed genes (DEGs) were identified using the training set, and functional correlation analyses were performed. The least absolute shrinkage and selection operator (LASSO), support vector machine-recursive feature elimination (SVM-RFE), and random forest (RF) algorithms were used to identify potential diagnostic biomarkers. To evaluate the diagnostic efficacy of these potential biomarkers, receiver operating characteristic (ROC) curves were plotted separately for the training and validation sets, and immunohistochemical (IHC) staining for the biomarkers was performed in DKD and control kidney tissues. In addition, the CIBERSORT, XCELL, and TIMER algorithms were employed to assess the infiltration of immune cells in DKD, and the relationships between the biomarkers and infiltrating immune cells were also investigated.
    A total of 95 DEGs were identified. Using the three machine learning algorithms, DUSP1 and PRKAR2B were identified as potential biomarker genes for the diagnosis of DKD. The diagnostic efficacy of DUSP1 and PRKAR2B was assessed using the areas under the curves in the ROC analysis of the training set (0.945 and 0.932, respectively) and the validation set (0.789 and 0.709, respectively). IHC staining suggested that the expression levels of DUSP1 and PRKAR2B were significantly lower in DKD patients than in controls. Immune cell infiltration analysis showed that memory B cells, gamma delta T cells, macrophages, and neutrophils may be involved in the development of DKD. Furthermore, both candidate genes were associated with these immune cell subtypes to varying extents.
    DUSP1 and PRKAR2B are potential diagnostic markers of DKD, and they are closely associated with immune cell infiltration.
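
The area under the ROC curve used above to assess a single-gene marker can be computed directly as the probability that a randomly chosen case scores higher than a randomly chosen control. The sketch below assumes nothing about the study's software; the function `roc_auc` and the expression values are hypothetical toy data. Because the abstract reports lower expression in DKD, the toy marker is negated so that higher scores indicate disease.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs in which the case has
    the higher score; tied pairs count as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical expression values; 1 = disease, 0 = control.
expression = [2.1, 2.5, 3.0, 3.2, 4.0, 4.4, 4.9, 5.3]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

# The marker is lower in disease, so negate it before scoring.
auc = roc_auc([-x for x in expression], labels)
print(round(auc, 4))
```

Running the same function on an independent validation set, as the abstract describes, guards against an AUC that is inflated by selecting the marker on the training data.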

  • Article type: Journal Article
    Half of the patients with heart failure (HF) have preserved ejection fraction (HFpEF). To date, there are no specific markers to distinguish this subgroup. The main objective of this work was to stratify HF patients using current biochemical markers coupled with clinical data. The cohort study included patients with HFpEF (n = 24) and heart failure with reduced ejection fraction (HFrEF) (n = 34), classified as is usual in clinical practice on the basis of cardiac imaging (EF ≥ 50% for HFpEF; EF < 50% for HFrEF). Routine blood tests measured biomarkers of renal and heart function, inflammation, and iron metabolism. A multi-test approach and analysis of peripheral blood samples aimed to establish a computerized Machine Learning strategy providing a blood signature to distinguish HFpEF from HFrEF. In logistic regression, demographic characteristics and clinical biomarkers did not significantly differentiate the HFpEF and HFrEF patient subgroups. In contrast, a multivariate factorial discriminant analysis, performed blindly on the data set, allowed us to stratify the two HF groups. Consequently, a Machine Learning (ML) strategy was developed using the same variables in a genetic algorithm approach. ML provided very encouraging exploratory results given the small sample size. Accuracy was 69% in the validation group and 64% in the test group. Sensitivity was 100% in the validation group and 75% in the test group, whereas specificity was only 44% and 55%, respectively, because of the small number of samples. Lastly, precision was acceptable, at 58% in the validation group and 60% in the test group. Combining biochemical and clinical markers is an excellent entry point for developing a computer classification tool to diagnose HFpEF. This translational approach is a springboard for improving new personalized treatment methods and identifying "high-yield" populations for clinical trials.
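
The four metrics quoted above (accuracy, sensitivity, specificity, precision) all derive from the same confusion matrix. A minimal sketch follows; the counts are hypothetical and are not reconstructed from the study, and `classification_metrics` is an illustrative helper name.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute standard binary-classification metrics from confusion-matrix
    counts. A metric is None when its denominator is zero."""
    def ratio(num, den):
        return num / den if den else None

    return {
        "accuracy": ratio(tp + tn, tp + fn + fp + tn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),     # positive predictive value
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=8, fn=2, fp=4, tn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With groups this small, each misclassified sample shifts a metric by several percentage points, which is one reason sensitivity and specificity can diverge as widely as reported.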

  • Article type: Journal Article
    The causal role of somatic mutation and its interrelationship with the gene expression profile during tumor development has already been observed, and it plays a major role in determining cancer grade and overall survival. Accurate and robust prediction of tumor grade and patients' overall survival is important for prognosis, risk factor identification, and improvement of the treatment strategy, especially for highly lethal tumors such as gliomas. Here, with the help of more accurate and widely used machine learning-based approaches, we propose an integrative computational pipeline that incorporates somatic mutations and gene expression profiles for survival and grade prediction of glioma patients and simultaneously relates these predictions to the drugs to be administered. This study gives us a clear understanding that the same drug is not effective for treating the same grade of cancer if the gene mutations are different. The alteration of a specific gene plays a very important role in tumor progression and should also be considered when selecting appropriate drugs. The proposed framework includes all the factors required for enhancing therapeutic designs and could be useful for clinicians in determining an accurate and personalized treatment strategy for individual patients suffering from different life-threatening diseases.