Essential genes

  • Article type: Journal Article
    Bacteria with streamlined genomes that harbor fully functional genes for essential metabolic networks can synthesize desired products more effectively and therefore have advantages as production platforms in industrial applications. To obtain streamlined chassis genomes, considerable effort has gone into reducing existing bacterial genomes. This work falls into two categories: rational reduction and random reduction. Over the past few decades, the identification of essential gene sets and the emergence of various genome-deletion techniques have greatly advanced genome reduction in many bacteria. Some of the constructed genomes have properties desirable for industrial applications, such as increased genome stability, transformation capacity, cell growth, and biomaterial productivity. However, the reduced growth and perturbed physiological phenotypes of some genome-reduced strains may limit their use as optimized cell factories. This review assesses the progress made to date in reducing bacterial genomes to construct optimal chassis for synthetic biology, covering the identification of essential gene sets, genome-deletion techniques, the properties and industrial applications of artificially streamlined genomes, the obstacles encountered in constructing reduced genomes, and future perspectives.

  • Article type: Journal Article
    Essential genes are critical for the growth and survival of any organism. Machine learning complements experimental methods by reducing the resources required for essentiality assays. Previous studies revealed the need to discover relevant features that significantly discriminate essential genes, to improve the generalizability of prediction models across organisms, and to construct a robust gold standard as the class label for the training data. Findings also show that a significant limitation of the machine learning approach is predicting conditionally essential genes, because the essentiality status of a gene can change under a specific condition of the organism. This review examines the methods applied to the essential gene prediction task, their strengths and limitations, and the factors responsible for effective computational prediction of essential genes. We discuss categories of features and how they contribute to the classification performance of essentiality prediction models. Five categories of features, namely gene sequence, protein sequence, network topology, homology, and gene ontology-based features, were generated for Caenorhabditis elegans to compare their capacity for essentiality prediction. The gene ontology-based feature category outperformed the other categories, mainly because of its high correlation with the genes' biological functions. However, the topology feature category provided the highest discriminatory power, making it more suitable for essentiality prediction. The major factor limiting machine learning prediction of conditional gene essentiality is the lack of labeled data for the conditions of interest with which to train a classifier. Therefore, cooperative machine learning could further exploit models that perform well in conditional essentiality prediction.
    Identifying essential genes is imperative because it provides an understanding of core structure and function and accelerates the discovery of drug targets, among other benefits. Recent studies have applied machine learning to complement the experimental identification of essential genes. However, several factors limit the performance of machine learning approaches. This review presents the standard procedure and resources available for predicting essential genes in organisms, and highlights the factors responsible for the current limitations of machine learning in predicting conditional gene essentiality. The choice of features and of machine learning technique was identified as an important factor in predicting essential genes effectively.
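
The abstract above compares feature categories by their discriminatory power but does not describe the procedure. As a rough illustration only, and not the authors' method, the sketch below scores one network-topology feature (the number of interaction partners per gene) by ROC AUC, computed as the fraction of essential/non-essential gene pairs that the feature ranks correctly. The gene names, the interaction network, and the essentiality labels are all made up for the example, and the helper names `degree_centrality` and `auc` are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count distinct interaction partners per gene in an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {gene: len(partners) for gene, partners in neighbors.items()}


def auc(scores, labels):
    """ROC AUC of a single feature: the probability that a randomly chosen
    essential gene scores higher than a randomly chosen non-essential gene,
    with ties counted as one half. Genes absent from `scores` get 0."""
    pos = [scores.get(g, 0) for g, essential in labels.items() if essential]
    neg = [scores.get(g, 0) for g, essential in labels.items() if not essential]
    if not pos or not neg:
        raise ValueError("need at least one gene of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration (not from any real organism).
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"),
         ("g2", "g3"), ("g2", "g5"), ("g4", "g6")]
labels = {"g1": True, "g2": False, "g3": True,
          "g4": False, "g5": False, "g6": False}

degree = degree_centrality(edges)
print(auc(degree, labels))  # 0.75 on this toy data
```

An AUC of 0.5 would mean the feature is no better than chance at separating the two classes, and 1.0 would mean perfect separation. A real comparison of feature categories would train a classifier on each category and evaluate it with cross-validation rather than scoring a single raw feature.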

  • Article type: Journal Article
    Essential genes have attracted increasing attention in recent years because of their important functions in organisms. Among the methods used to identify essential genes, accurate and efficient computational methods can make up for the deficiencies of expensive and time-consuming experimental techniques. In this review, we have collected studies on essential gene prediction in prokaryotes and eukaryotes and summarized the five predominant types of features used in these studies: evolutionary conservation, domain information, network topology, sequence composition, and expression level. We describe how to implement the useful forms of these features and evaluate their performance on data from Escherichia coli MG1655, Bacillus subtilis 168, and humans. The prerequisites and applicable range of these features are described. In addition, we have investigated the techniques used to weight features in various models. To assist researchers in the field, we mention two freely accessible online tools that can be used directly to predict gene essentiality in prokaryotes and humans. This article provides a simple guide to the identification of essential genes in prokaryotes and eukaryotes.