Bioinformatics

  • Article type: Journal Article
    The large-scale heterogeneity of genetic diseases necessitates deeper examination of nucleotide sequence alterations to enhance the discovery of new targeted drug attack points. The appearance of new sequencing techniques was essential for obtaining more interpretable genomic data. In contrast to earlier short reads, longer reads can provide better insight into potentially health-threatening genetic abnormalities. Long reads offer more accurate variant identification and genome assembly, indicating advances in studies of nucleotide defects. In this review, we introduce the historical background of sequencing technologies and present their benefits and limitations. Furthermore, we highlight the differences between short- and long-read approaches, including their unique advances and difficulties in methodology and evaluation. Additionally, we provide a detailed description of the corresponding bioinformatics and the current applications.

  • Article type: Journal Article
    BACKGROUND: Glioblastoma multiforme (GBM) is a fast-growing brain glioma associated with a very poor prognosis. This study aims to identify key genes whose expression is associated with overall survival (OS) in patients with GBM.
    METHODS: A systematic review was performed using PubMed, Scopus, Cochrane, and Web of Science up to January 2024. Two researchers independently extracted the data and assessed study quality according to the Newcastle-Ottawa Scale (NOS). The genes whose expression was found to be associated with survival were identified and carried into a subsequent bioinformatic study. The products of these genes were analyzed for protein-protein interaction (PPI) relationships using STRING. Additionally, the most important genes associated with GBM patients' survival were identified using the Cytoscape 3.9.0 software. For final validation, the GEPIA and CGGA (mRNAseq_325 and mRNAseq_693) databases were used to conduct OS analyses. Gene set enrichment analysis was performed with GO Biological Process 2023.
    RESULTS: From an initial search of 4104 articles, 255 studies from 24 countries were included. The studies described 613 unique genes whose mRNAs were significantly associated with OS in GBM patients, of which 107 were described in two or more studies. Based on the NOS, 131 studies were of high quality, while 124 were considered low quality. According to the PPI network, 31 key target genes were identified. Pathway analysis revealed five hub genes (IL6, NOTCH1, TGFB1, EGFR, and KDR). However, in the validation study, only the FN1 gene was significant in three cohorts.
    CONCLUSIONS: We successfully identified the 31 most important genes, whose products may be considered potential prognostic biomarkers as well as candidate target genes for innovative therapy of GBM tumors.
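    One common way to shortlist hub genes from a PPI network is to rank nodes by degree, that is, by their number of distinct interaction partners. The sketch below illustrates that idea only; the abstract does not state which centrality measure was applied in Cytoscape, and the function name `rank_by_degree` and the placeholder node names are assumptions for illustration, not STRING data.

```python
from collections import Counter


def rank_by_degree(edges, top_n=3):
    """Rank nodes of an undirected interaction network by degree.

    `edges` is an iterable of (node_a, node_b) pairs. Self-loops and
    duplicate edges (in either direction) are ignored.
    """
    degree = Counter()
    seen = set()
    for a, b in edges:
        if a == b:
            continue  # skip self-interactions
        key = frozenset((a, b))
        if key in seen:
            continue  # skip repeated edges such as (B, C) and (C, B)
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name for a stable order.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical toy network with placeholder node names.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "B")]
top_nodes = rank_by_degree(toy_edges, top_n=2)
```

    Degree is only one of several centrality measures; betweenness or closeness could be substituted without changing the overall shape of the ranking step.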

  • Article type: Journal Article
    Natural plant sources are essential in the development of several anticancer drugs, such as vincristine, vinblastine, vinorelbine, docetaxel, paclitaxel, camptothecin, etoposide, and teniposide. However, various chemotherapies fail due to adverse reactions, drug resistance, and poor target specificity. Researchers are now focusing on developing drugs that use natural compounds to overcome these issues. These drugs can affect multiple targets, have reduced adverse effects, and are effective against several cancer types. Developing a new drug is a highly complex, expensive, and time-consuming process. Traditional drug discovery methods take up to 15 years for a new medicine to enter the market and cost more than one billion USD. However, recent advances in Computer-Aided Drug Discovery (CADD) have changed this situation. This paper aims to comprehensively describe the different CADD approaches for identifying anticancer drugs from natural products. Data from various sources, including ScienceDirect, Elsevier, NCBI, and Web of Science, are used in this review. In-silico techniques and optimization algorithms can provide versatile solutions in drug discovery ventures. The structure-based drug design technique is widely used to understand the molecular-level interactions of chemical constituents and to identify hit leads. This review will discuss the concept of CADD, in-silico tools, virtual screening in drug discovery, and the concept of natural products as anticancer therapies. Representative examples of the molecules identified will also be provided.

  • Article type: Systematic Review
    SARS-CoV-2 is an enveloped RNA virus that causes severe respiratory illness in humans and animals. It infects cells by binding its Spike protein to the host's angiotensin-converting enzyme 2 (ACE2). The bat is considered the natural host of the virus, and zoonotic transmission is a significant risk that can arise when humans come into close contact with infected animals. Therefore, understanding the interconnection between human, animal, and environmental health is important for preventing and controlling future coronavirus outbreaks. This work aimed to systematically review the literature to identify characteristics that make mammals suitable virus transmitters and to survey the main computational methods used to evaluate SARS-CoV-2 in mammals. Based on this review, it was possible to identify the main transmission-related factors mentioned in the literature, such as the expression of ACE2 and proximity to humans, and to identify the computational methods used for their study, such as machine learning, molecular modeling, and computational simulation, among others. The findings contribute to the prevention and control of future outbreaks, provide information on transmission factors, and highlight the importance of advanced computational methods in the study of infectious diseases, which allow a deeper understanding of transmission patterns and can help in the development of more effective control and intervention strategies.

  • Article type: Journal Article
    The year 2023 marked a significant surge in the exploration of applying large language model (LLM) chatbots, notably ChatGPT, across various disciplines. We surveyed the applications of ChatGPT in bioinformatics and biomedical informatics throughout the year, covering omics, genetics, biomedical text mining, drug discovery, biomedical image understanding, bioinformatics programming, and bioinformatics education. Our survey delineates the current strengths and limitations of this chatbot in bioinformatics and offers insights into potential avenues for future developments.

  • Article type: Journal Article
    Tendon Sheath Giant Cell Tumor (TGCT) is a benign tumor that primarily grows within joints and bursae. However, it has a high postoperative recurrence rate, ranging from 15% to 45%. Although radiotherapy may reduce this recurrence rate, its applicability as a standard treatment is still controversial. Furthermore, the pathogenic mechanisms of TGCT are not clear, which limits the development of effective treatment methods. The unpredictable growth and high recurrence rate of TGCT add to the challenges of disease management. Currently, our understanding of TGCT depends mainly on pathological slice analysis because stable cell models are lacking. In this study, we first reviewed the medical records of two female TGCT patients who had undergone radiotherapy. Then, by combining bioinformatics and machine learning, we interpreted the pathogenesis of TGCT and its associations with other diseases from multiple perspectives. Based on an in-depth analysis of the case data, we provided empirical support for postoperative radiotherapy in TGCT patients. Additionally, our further analysis revealed the signaling pathways of differentially expressed genes in TGCT, as well as its potential associations with osteoarthritis and synovial sarcoma.

  • Article type: Journal Article
    Accurate indel calling plays an important role in precision medicine. A benchmarking indel set is essential for thoroughly evaluating the indel calling performance of bioinformatics pipelines. A reference sample with a set of known-positive variants was developed in the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project, but the known indels in the known-positive set were limited. This project sought to provide an enriched set of known indels that would be more translationally relevant by focusing on additional cancer-related regions. A thorough manual review process completed by 42 reviewers, two advisors, and a judging panel of three researchers enriched the known indel set by an additional 516 indels. The extended benchmarking indel set has a large range of variant allele frequencies (VAFs), with 87% of the indels having a VAF below 20% in reference Sample A. Reference Sample A and the indel set can be used for comprehensive benchmarking of indel calling across a wider range of VAF values in the lower range. Indel length was also variable, but the majority were under 10 base pairs (bp). Most of the indels were within coding regions, with the remainder in gene regulatory regions. Although high confidence can be derived from the robust study design and meticulous human review, this extensive indel set has not undergone orthogonal validation. The extended benchmarking indel set, along with the indels in the previously published known-positive set, was the truth set used to benchmark indel calling pipelines in a community challenge hosted on the precisionFDA platform. This benchmarking indel set and the reference samples can be used for a comprehensive evaluation of indel calling pipelines. Additionally, the insights and solutions obtained during the manual review process can aid in improving the performance of these pipelines.
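    For readers unfamiliar with the term, a variant allele frequency is conventionally the fraction of reads at a site that support the variant allele. The sketch below shows that ratio and the kind of "share of variants below a threshold" summary quoted above. The function names and the example read counts are hypothetical and are not taken from the SEQC2 data.

```python
def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Return the fraction of reads at a site that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def fraction_below(vafs, threshold=0.20):
    """Return the share of VAF values strictly below `threshold`."""
    vafs = list(vafs)
    if not vafs:
        raise ValueError("vafs must not be empty")
    return sum(v < threshold for v in vafs) / len(vafs)


# Hypothetical (alt_reads, total_reads) pairs for four indel sites.
example_counts = [(5, 100), (10, 100), (50, 100), (15, 100)]
example_vafs = [variant_allele_frequency(a, t) for a, t in example_counts]
share_low = fraction_below(example_vafs, threshold=0.20)
```

    In practice the read counts would come from a variant caller's output; how a given pipeline counts supporting reads for indels can differ, so values from different tools are not always directly comparable.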

  • Article type: Journal Article
    Circular RNAs (circRNAs) play a crucial role in cancer development and progression. This study aimed to identify potential circRNA biomarkers for osteosarcoma. Articles published from January 2010 to September 2023 were searched across eight databases to compare circRNA expression profiles in osteosarcoma and control samples (human, animal, and cell lines). Meta-analysis was conducted under a random-effects model. Subgroup analysis of circRNAs in different samples and tissues was performed. Diagnostic value was evaluated using receiver operating characteristic curves. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses explored the functions of circRNA host genes. A circRNA-miRNA-mRNA axis depicted the regulatory mechanism in osteosarcoma. Of the 1356 differentially expressed circRNAs identified across 226 original studies, only 74 were reported in at least three published sub-studies. Meta-analysis identified 58 dysregulated circRNAs (52 upregulated and 6 downregulated). Eleven circRNAs consistently showed dysregulation in tissues and cell lines, with hsa_circ_0005721 showing potential as a circulating biomarker in osteosarcoma. Sensitivity analysis demonstrated 97% consistency. The overall area under the curve was 0.87 (95% CI, 0.83-0.89). GO and KEGG enrichment analyses revealed host gene involvement in cancer. The circRNA-miRNA-mRNA axis revealed the regulatory axis and interactions specific to osteosarcoma. This study demonstrates that circRNAs are potential diagnostic biomarkers for osteosarcoma. Consistently reported dysregulated circRNAs are potential biomarkers in osteosarcoma pathogenesis, with hsa_circ_0005721 as a potential circulating biomarker for diagnosis and treatment.
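    As an illustration of what pooling under a random-effects model involves, the sketch below uses the DerSimonian-Laird estimator of between-study variance. This choice is an assumption: the abstract does not name the estimator or software that was used, and the function name `dersimonian_laird` and the inputs are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool per-study effects with a DerSimonian-Laird random-effects model.

    Returns (pooled_effect, standard_error, tau_squared), where tau_squared
    is the estimated between-study variance.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the moment estimate of tau^2, truncated at zero.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    se = math.sqrt(1.0 / sum_w_re)
    return pooled, se, tau2
```

    When the studies agree closely, the estimated between-study variance is truncated to zero and the result coincides with the fixed-effect estimate; larger disagreement widens the standard error of the pooled effect.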

  • Article type: Systematic Review
    BACKGROUND: Untargeted direct mass spectrometric analysis of volatile organic compounds has many potential applications across fields such as healthcare and food safety. However, robust data processing protocols must be employed to ensure that research is replicable and that practical applications can be realised. User-friendly data processing and statistical tools are becoming increasingly available; however, the use of these tools has not been analysed, nor are they necessarily suited to every data type.
    OBJECTIVE: This review aims to analyse the data processing and analytic workflows currently in use and to examine whether methodological reporting is sufficient to enable replication.
    METHODS: Studies identified from the Web of Science and Scopus databases were systematically examined against the inclusion criteria. The experimental, data processing, and data analysis workflows were reviewed for the relevant studies.
    RESULTS: Of 459 studies identified from the databases, a total of 110 met the inclusion criteria. Very few papers provided enough detail to allow all aspects of the methodology to be replicated accurately, with only three meeting previous guidelines for reporting experimental methods. A wide range of data processing methods was used, with only eight papers (7.3%) employing a largely similar workflow where direct comparability was achievable.
    CONCLUSIONS: Standardised workflows and reporting systems need to be developed to ensure that research in this area is replicable, comparable, and held to a high standard, thus allowing the wide-ranging potential applications to be realised.

  • Article type: Journal Article
    Drug-drug interactions (DDIs) can produce unpredictable pharmacological effects and lead to adverse events that have the potential to cause irreversible damage to the organism. Traditional methods of detecting DDIs through biological or pharmacological analysis are time-consuming and expensive; therefore, there is an urgent need to develop computational methods that predict drug-drug interactions effectively. Currently, deep learning and knowledge graph techniques, which can effectively extract features of entities, have been widely used to develop DDI prediction methods. In this work, we aim to systematically review DDI prediction studies that apply deep learning and knowledge graphs. The available biomedical data and public databases related to drugs are first summarized. Then, we discuss the existing DDI prediction methods that use deep learning and knowledge graph techniques and group them into three main classes: deep learning-based methods, knowledge graph-based methods, and methods that combine deep learning with knowledge graphs. We comprehensively analyze the commonly used drug-related data and various DDI prediction methods, and compare these prediction methods on benchmark datasets. Finally, we briefly discuss the challenges related to DDI prediction, including asymmetric DDI prediction and high-order DDI prediction.