Keywords: directed evolution; landscape; protein design; protein engineering; sequence-performance mapping

MeSH: Directed Molecular Evolution; Protein Engineering; Biotechnology; Proteins / genetics; Amino Acid Sequence

Source: DOI:10.1016/j.cels.2023.06.009

Abstract:
Discovery and evolution of new and improved proteins have empowered molecular therapeutics, diagnostics, and industrial biotechnology. Discovery and evolution both require efficient screens and effective libraries, although they differ in their challenges because of the absence or presence, respectively, of an initial protein variant with the desired function. A host of high-throughput technologies, both experimental and computational, enable efficient screens to identify performant protein variants. In partnership, an informed search of sequence space is needed to overcome the immensity, sparsity, and complexity of the sequence-performance landscape. Early in the historical trajectory of protein engineering, these elements aligned with distinct approaches to identify the most performant sequence: selection from large, randomized combinatorial libraries versus rational computational design. Substantial advances have now emerged from the synergy of these perspectives. Rational design of combinatorial libraries aids the experimental search of sequence space, and high-throughput, high-integrity experimental data inform computational design. At the core of the collaborative interface, efficient protein characterization (rather than mere selection of optimal variants) maps sequence-performance landscapes. Such quantitative maps elucidate the complex relationships between protein sequence and performance (e.g., binding, catalytic efficiency, biological activity, and developability), thereby advancing fundamental protein science and facilitating protein discovery and evolution.