databases, genetic

  • Article type: Journal Article
    Profile hidden Markov models (pHMMs) achieve high sensitivity in remote homology search, making them a popular choice for detecting novel or highly diverged viruses in metagenomic data. However, existing pHMM databases differ in design focus, which makes it difficult for users to decide which one to use. In this review, we provide a thorough evaluation and comparison of multiple commonly used pHMM databases for viral sequence discovery in metagenomic data. We characterize the databases by comparing their sizes, their taxonomic coverage, and the properties of their models using quantitative metrics. We then assess their performance in virus identification across multiple application scenarios, using both simulated and real metagenomic data. We aim to offer researchers a critical assessment of the strengths and limitations of the different databases. Furthermore, based on the experimental results from the simulated and real metagenomic data, we provide practical suggestions to help users optimize their use of pHMM databases, thus enhancing the quality and reliability of their findings in viral metagenomics.

  • Article type: Journal Article
    Premature ovarian failure (POF), which is often comorbid with dry eye disease (DED), is a key issue affecting female health. Here, we explored the mechanism underlying comorbid POF and DED to further elucidate disease mechanisms and improve treatment. Datasets related to POF (GSE39501) and DED (GSE44101) were identified from the Gene Expression Omnibus (GEO) database and subjected to weighted gene coexpression network analysis (WGCNA) and differentially expressed gene (DEG) analysis, respectively, and the intersection of the results was used to obtain 158 genes comorbid in POF and DED. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses of the comorbid genes revealed that they were primarily related to DNA replication and the cell cycle, respectively. Protein-protein interaction (PPI) network analysis of the comorbid genes obtained the 15 hub genes: CDC20, BIRC5, PLK1, TOP2A, MCM5, MCM6, MCM7, MCM2, CENPA, FOXM1, GINS1, TIPIN, MAD2L1, and CDCA3. To validate the analysis results, additional POF- and DED-related datasets (GSE48873 and GSE171043, respectively) were selected. A miRNA-lncRNA-gene network and machine learning methods were used to further analyze the comorbid genes. The DGIdb database identified valdecoxib, amorfrutin A, and kaempferitrin as potential drugs. The comorbid genes of POF and DED were thus identified from a bioinformatics perspective, providing a new strategy for exploring the comorbidity mechanism and opening up a new direction for the diagnosis and treatment of comorbid POF and DED.
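The intersection step described in this abstract can be illustrated with a minimal Python sketch. It assumes that each analysis has already produced a plain list of gene symbols; the function name `comorbid_genes` and the placeholder symbols `GENE_A` and `GENE_B` are hypothetical and do not come from the paper, and the toy lists are not the study's data.

```python
def comorbid_genes(wgcna_module_genes, deg_genes):
    """Return the sorted gene symbols present in both input lists."""
    return sorted(set(wgcna_module_genes) & set(deg_genes))


# Toy inputs for illustration only (not results from GSE39501 or GSE44101).
pof_module_genes = ["CDC20", "PLK1", "TOP2A", "GENE_A"]
ded_deg_genes = ["PLK1", "CDC20", "GENE_B"]

shared = comorbid_genes(pof_module_genes, ded_deg_genes)
print(shared)  # ['CDC20', 'PLK1']
```

Using sets makes the result independent of input order and of duplicate symbols; sorting only gives a stable output for reporting.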

  • Article type: Journal Article
    Marine microbial communities form the basis for the functioning of marine ecosystems and the conservation of biodiversity. With the application of metagenomics and metatranscriptomics in marine environmental studies, significant progress has been made in analysing the functioning of microbial communities as a whole. These molecular techniques are highly dependent on reliable, well-characterised, comprehensive and taxonomically diverse sequenced reference transcriptomes of microbial organisms. Here we present a set of 12 individual transcriptome assemblies derived from 6 representative diatom species from the northern Adriatic Sea grown under 2 environmentally relevant growth conditions (phosphate replete vs. phosphate deprived). After read filtering and assembly, an average of 64,932 transcripts per assembly was obtained, of which an average of 8856 were assigned to functionally known proteins. Of all assigned transcripts, an average of 6483 proteins were taxonomically assigned to diatoms (Bacillariophyta). On average, a higher number of assigned proteins was detected in the transcriptome assemblies of diatoms grown under replete medium conditions. On average, 50% of the mapped proteins were shared between the two growth conditions. All recorded proteins in the dataset were classified into 24 COG categories, with approximately 25% belonging to the unknown-function category and the remaining 75% belonging to all other categories. The resulting diatom reference database for the northern Adriatic, focussing on the response to nutrient limitation as characteristic for the region and predicted for the future world oceans, provides a valuable resource for analysing environmental metatranscriptome and metagenome data. Each northern Adriatic transcriptome can also be used by itself as a reference database for (meta)transcriptome and gene expression studies of the associated species that will be generated in the future.

  • Article type: Journal Article
    The significance of entomological evidence in inferring the time, location and cause of death has been demonstrated both theoretically and practically. With the advancement of sequencing technologies, reports have emerged on necrophagous insects' nuclear genomes, transcriptomes, proteomes and mitochondrial genomes. However, within the field of forensic entomology, there is currently no available database that can integrate, store and share the resources of necrophagous insects. The absence of a database poses an inconvenience to the application of entomological evidence in judicial practice and hampers the development of the forensic entomology discipline. Given this, we have developed the Home Of Forensic Entomology database, encompassing 10 core functional modules: Home, Browse, Mitochondria, Proteome, JBrowse, Search, BLAST, Tools, Case base and Maps. Notably, the 'Tools' module enables multiple sequence alignment analysis (Muscle), homologous protein prediction (Genewise), primer design (Primer), large-scale genomic analysis (Lastz), Gene Ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analysis, as well as expression profiling (PCA Analysis, Hcluster and Correlation Heatmap). In addition, the database also works as an interactive platform where researchers can share forensic entomological case reports and upload data and material. This database provides potential visitors with comprehensive functionality for multi-omics data analysis, offers substantial references to researchers and crime scene investigators and facilitates the utilization of entomological evidence in court. Database URL: http://ihofe.com/.

  • Article type: Journal Article
    The study of rare diseases is important not only for the individuals affected but also for the advancement of medical knowledge and a deeper understanding of human biology and genetics. The wide repertoire of structural information now available from reliable and accurate prediction methods provides the opportunity to investigate the molecular origins of most of the rare diseases reviewed in the Orpha.net database. Thus, it has been possible to analyze the topology of the pathogenic missense variants found in the 2515 proteins involved in Mendelian rare diseases (MRDs), which form the database for our structural bioinformatics study. The amino acid substitutions responsible for MRDs showed different mutation site distributions at different three-dimensional protein depths. We then highlighted the depth-dependent effects of the 20,061 pathogenic variants present in our database. The results of this structural bioinformatics investigation are relevant, as they provide additional clues to mitigate the damage caused by MRDs.

  • Article type: Journal Article
    BACKGROUND: With the emergence of Oxford Nanopore technology, on-site sequencing of 16S rRNA from environments is now available. Due to the error level and structure, the analysis of such data demands a database of reference sequences. However, many taxa from complex and diverse environments have poor representation in publicly available databases. In this paper, we propose the METASEED pipeline for the reconstruction of full-length 16S sequences from such environments, in order to improve the reference for the subsequent use of on-site sequencing.
    RESULTS: We show that combining high-precision short-read sequencing of both 16S and the full metagenome from the same samples allows us to reconstruct high-quality 16S sequences from the more abundant taxa. A significant novelty is the carefully designed collection of metagenome reads that match the 16S amplicons, based on a combination of uniqueness and abundance. Compared to alternative approaches, this produces superior results.
    CONCLUSIONS: Our pipeline will facilitate numerous studies associated with various unknown microorganisms, thus improving the understanding of diverse environments. The pipeline is a potential tool for generating a full-length 16S rRNA gene database for any environment.

  • Article type: Journal Article
    The Gene Ontology (GO) project describes the functions of the gene products of organisms from all kingdoms of life in a standardized way, enabling powerful analyses of experiments involving genome-wide analysis. The scientific literature is used to convert experimental results into GO annotations that systematically classify gene products' functions. However, because only a minor fraction of all genes has been characterized experimentally, multiple predictive methods for assigning GO annotations have been developed since the inception of GO. Sequence homologies between novel genes and genes with known functions help to approximate the roles of these non-characterized genes. Here we describe the main sequence homology methods used to produce annotations: pairwise comparison (BLAST), protein profile models (InterPro), and phylogeny-based annotation (PAINT). Some of these methods can be implemented with genome analysis pipelines (BLAST and InterPro2GO), while PAINT is curated by the GO consortium.

  • Article type: Journal Article
    Neurological pain (NP) is always accompanied by symptoms of depression, which seriously affects physical and mental health. In this study, we identified the common hub genes (Co-hub genes) and related immune cells of NP and major depressive disorder (MDD) to determine whether they have common pathological and molecular mechanisms. NP and MDD expression data were downloaded from the Gene Expression Omnibus (GEO) database. Common differentially expressed genes (Co-DEGs) for NP and MDD were extracted, and the hub genes and hub nodes were mined. Co-DEGs, hub genes, and hub nodes were analyzed for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment. Finally, the hub nodes and genes were analyzed to obtain Co-hub genes. We plotted receiver operating characteristic (ROC) curves to evaluate the diagnostic impact of the Co-hub genes on MDD and NP. We also identified the immune-infiltrating cell components by ssGSEA and analyzed the relationships among them. In the GO and KEGG enrichment analyses, 93 Co-DEGs were associated with biological processes (BP), such as fibrinolysis; cell components (CC), such as tertiary granules; and pathways, such as complement and coagulation cascades. A differential gene expression analysis revealed significant differences in the Co-hub genes ANGPT2, MMP9, PLAU, and TIMP2. There was some accuracy in the diagnosis of NP based on the expression of ANGPT2 and MMP9. Analysis of differences in the immune cell components indicated an abundance of activated dendritic cells, effector memory CD8+ T cells, memory B cells, and regulatory T cells in both groups, and these differences were statistically significant. In summary, we identified 6 Co-hub genes and 4 immune cell types related to NP and MDD. Further studies are needed to determine the role of these genes and immune cells as potential diagnostic markers or therapeutic targets in NP and MDD.
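As an illustration of how a ROC analysis summarizes the diagnostic value of a single gene, the sketch below computes the area under the ROC curve (AUC) from its rank interpretation: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The abstract does not state which software the authors used, so this is not their implementation; the function name `roc_auc` and the toy expression values are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs in which the case scores higher.

    labels: 1 for cases, 0 for controls; scores: e.g. expression of one gene.
    Ties between a case and a control count as half a win.
    """
    cases = [s for l, s in zip(labels, scores) if l == 1]
    controls = [s for l, s in zip(labels, scores) if l == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data for illustration only (not values from the study).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(round(roc_auc(labels, scores), 3))  # 0.889
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, which is acceptable for small cohorts but would be replaced by a rank-based computation for large ones.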

  • Article type: Journal Article
    OBJECTIVE: Glycolysis and immune metabolism play important roles in acute myocardial infarction (AMI). Therefore, this study aimed to identify and experimentally validate the glycolysis-related hub genes in AMI as diagnostic biomarkers, and further explore the association between hub genes and immune infiltration.
    METHODS: Differentially expressed genes (DEGs) from AMI peripheral blood mononuclear cells (PBMCs) were analyzed using R software. Glycolysis-related DEGs (GRDEGs) were identified and analyzed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) for functional enrichment. A protein-protein interaction network was constructed using the STRING database and visualized using Cytoscape software. Immune infiltration analysis between patients with AMI and stable coronary artery disease (SCAD) controls was performed using CIBERSORT, and correlation analysis between GRDEGs and immune cell infiltration was performed. We also plotted nomograms and receiver operating characteristic (ROC) curves to assess the predictive accuracy of GRDEGs for AMI occurrence. Finally, key genes were experimentally validated using reverse transcription-quantitative polymerase chain reaction (RT-qPCR) and western blotting using PBMCs.
    RESULTS: A total of 132 GRDEGs and 56 GRDEGs were identified on the first day and 4-6 days after AMI, respectively. Enrichment analysis indicated that these GRDEGs were mainly clustered in the glycolysis/gluconeogenesis and metabolic pathways. Five hub genes (HK2, PFKL, PKM, G6PD, and ALDOA) were selected using the cytoHubba plugin. The link between immune cells and hub genes indicated that HK2, PFKL, PKM, and ALDOA were significantly positively correlated with monocytes and neutrophils, whereas G6PD was significantly positively correlated with neutrophils. The calibration curve, decision curve analysis, and ROC curves indicated that the five hub GRDEGs exhibited high predictive value for AMI. Furthermore, the five hub GRDEGs were validated by RT-qPCR and western blotting.
    CONCLUSIONS: We concluded that HK2, PFKL, PKM, G6PD, and ALDOA are hub GRDEGs in AMI and play important roles in AMI progression. This study provides a novel potential immunotherapeutic method for the treatment of AMI.

  • Article type: Journal Article
    Metabolic-associated steatohepatitis (MASH) and ulcerative colitis (UC) exhibit a complex interconnection with immune dysfunction, dysbiosis of the gut microbiota, and activation of inflammatory pathways. This study aims to identify and validate critical butyrate metabolism-related genes shared between UC and MASH. Clinical information and gene expression profiles were sourced from the Gene Expression Omnibus (GEO) database. Shared butyrate metabolism-related differentially expressed genes (sBM-DEGs) between UC and MASH were identified via various bioinformatics methods. Functional enrichment analysis was performed, and UC patients were categorized into subtypes using the consensus clustering algorithm based on sBM-DEGs. Key genes within sBM-DEGs were screened through Random Forest, Support Vector Machines-Recursive Feature Elimination, and Light Gradient Boosting. The diagnostic efficacy of these genes was evaluated using receiver operating characteristic (ROC) analysis on independent datasets. Additionally, the expression levels of characteristic genes were validated across multiple independent datasets and human specimens. Forty-nine shared DEGs between UC and MASH were identified, with enrichment analysis highlighting significant involvement in immune, inflammatory, and metabolic pathways. The intersection of butyrate metabolism-related genes with these DEGs produced 10 sBM-DEGs. These genes facilitated the identification of molecular subtypes of UC patients using an unsupervised clustering approach. ANXA5, CD44, and SLC16A1 were pinpointed as hub genes through machine learning algorithms and feature importance rankings. ROC analysis confirmed their diagnostic efficacy in UC and MASH across various datasets. Additionally, the expression levels of these three hub genes showed significant correlations with immune cells. These findings were validated across independent datasets and human specimens, corroborating the bioinformatics analysis results. Integrated bioinformatics identified three significant biomarkers, ANXA5, CD44, and SLC16A1, as DEGs linked to butyrate metabolism. These findings offer new insights into the role of butyrate metabolism in the pathogenesis of UC and MASH and suggest the potential of these genes as valuable diagnostic biomarkers.