MeSH terms: Humans; Antibodies, Neutralizing / chemistry, genetics, immunology; Antibodies, Viral / chemistry, genetics, immunology; Antibody Affinity; Antigen-Antibody Complex / chemistry; COVID-19 / virology, immunology; Models, Molecular; Protein Conformation; Protein Engineering; SARS-CoV-2 / immunology, genetics; Directed Molecular Evolution / methods

Source: DOI:10.1126/science.adk8946

Abstract:
Large language models trained on sequence information alone can learn high-level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here, we show that a general protein language model augmented with protein structure backbone coordinates can guide evolution for diverse proteins without the need to model individual functional tasks. We also demonstrate that ESM-IF1, which was only trained on single-chain structures, can be extended to engineer protein complexes. Using this approach, we screened about 30 variants of two therapeutic clinical antibodies used to treat severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We achieved up to 25-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants of concern BQ.1.1 and XBB.1.5, respectively. These findings highlight the advantage of integrating structural information to identify efficient protein evolution trajectories without requiring any task-specific training data.
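The abstract does not spell out how candidate variants are chosen. One plausible reading of "guiding evolution" with a structure-conditioned model is to rank single-residue substitutions by how much more likely the model considers them than the wild-type residue. The sketch below illustrates only that ranking step. The per-position log-probabilities are made-up toy values standing in for the output of a model such as ESM-IF1, and the names `rank_substitutions` and `toy_log_probs` are hypothetical, not part of any library or of the authors' method.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def rank_substitutions(wild_type, log_probs, top_k=30):
    """Rank single substitutions by log-likelihood ratio against the wild type.

    log_probs[i][aa] is the log-probability of residue `aa` at position i.
    In practice this would come from a structure-conditioned model; here it
    is an assumed input.
    """
    candidates = []
    for i, wt_aa in enumerate(wild_type):
        for aa in AMINO_ACIDS:
            if aa == wt_aa:
                continue
            score = log_probs[i][aa] - log_probs[i][wt_aa]
            # Mutation label uses 1-based positions, e.g. "T3R".
            candidates.append((score, f"{wt_aa}{i + 1}{aa}"))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:top_k]


def toy_log_probs(wild_type, favored):
    """Build made-up per-position distributions for illustration only.

    `favored` maps a 0-based position to a residue the toy "model" prefers
    over the wild type at that position.
    """
    table = []
    for i, wt_aa in enumerate(wild_type):
        probs = {aa: 0.02 for aa in AMINO_ACIDS}
        if i in favored:
            probs[favored[i]] = 0.50
            probs[wt_aa] = 0.14
        else:
            probs[wt_aa] = 0.62
        table.append({aa: math.log(p) for aa, p in probs.items()})
    return table


if __name__ == "__main__":
    wt = "MKTAY"  # toy sequence, not a real antibody
    scores = toy_log_probs(wt, favored={2: "R"})
    for score, mutation in rank_substitutions(wt, scores, top_k=3):
        print(f"{mutation}\t{score:+.3f}")
```

Setting `top_k` to roughly 30 would mirror the number of variants the abstract says were screened, though how the authors arrived at that set is not stated here.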