structure prediction

  • Article type: Journal Article
    Proteins perform their biological functions through motion. Although high-throughput prediction of the three-dimensional static structures of proteins has proved feasible using deep-learning-based methods, predicting conformational motions remains a challenge. Purely data-driven machine learning methods have difficulty addressing such motions because available laboratory data on conformational motions are still limited. In this work, we develop a method for generating protein allosteric motions by integrating physical energy landscape information into deep-learning-based methods. We show that local energetic frustration, which quantifies the local features of the energy landscape governing protein allosteric dynamics, can be used to empower AlphaFold2 (AF2) to predict protein conformational motions. Starting from ground-state static structures, this integrative method generates alternative structures as well as pathways of protein conformational motions, using a progressive enhancement of the energetic frustration features in the input multiple sequence alignment sequences. For a model protein, adenylate kinase, we show that the generated conformational motions are consistent with available experimental and molecular dynamics simulation data. Applying the method to two further proteins, KaiB and ribose-binding protein, which involve large-amplitude conformational changes, also successfully generates the alternative conformations. We also show how to extract overall features of the AF2 energy landscape topography, which has been considered by many to be a black box. Incorporating physical knowledge into deep-learning-based structure prediction algorithms provides a useful strategy to address the challenges of dynamic structure prediction of allosteric proteins.

  • Article type: Journal Article
    A new two-dimensional (2D) non-MXene transition metal carbide, Mo3C2, was found using the USPEX code. Comprehensive first-principles calculations show that the Mo3C2 monolayer exhibits thermal, dynamic, and mechanical stability, which can ensure excellent durability in practical applications. The optimized structures of Lix@(3×3)-Mo3C2 (x = 1-36) and Nax@(3×3)-Mo3C2 (x = 1-32) were identified as prospective anode materials. The metallic Mo3C2 sheet exhibits low diffusion barriers of 0.190 eV for Li and 0.118 eV for Na and low average open-circuit voltages of 0.31-0.55 V for Li and 0.18-0.48 V for Na. When adsorbing two layers of adatoms, the theoretical capacities are 344 and 306 mA h g-1 for Li and Na, respectively, which are comparable to that of commercial graphite. Moreover, the Mo3C2 substrate can maintain structural integrity during the lithiation or sodiation process at high temperature. Considering these features, our proposed Mo3C2 slab is a potential candidate as an anode material for future Li- and Na-ion batteries.
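
The capacities quoted in this abstract can be checked with the standard gravimetric-capacity formula, C = nF / (3.6 M). The sketch below is not taken from the paper. It assumes that the (3×3) supercell contains 9 Mo3C2 formula units, that each adatom transfers one electron, and that the capacity is normalized to the mass of the bare Mo3C2 host. Under those assumptions the maximum loadings of 36 Li and 32 Na appear to reproduce the reported 344 and 306 mA h g-1.

```python
# Faraday constant in C/mol; dividing by 3.6 converts C/g into mAh/g.
FARADAY = 96485.332
ATOMIC_MASS = {"Mo": 95.95, "C": 12.011}  # standard atomic masses, g/mol

# Assumption (not stated in the abstract): 9 formula units per (3x3) supercell.
FORMULA_UNITS_PER_SUPERCELL = 9


def host_molar_mass(n_mo=3, n_c=2):
    """Molar mass of one Mo3C2 formula unit in g/mol."""
    return n_mo * ATOMIC_MASS["Mo"] + n_c * ATOMIC_MASS["C"]


def gravimetric_capacity(ions_per_formula_unit, electrons_per_ion=1):
    """Theoretical capacity in mAh/g, normalized to the host mass only."""
    charge = ions_per_formula_unit * electrons_per_ion * FARADAY
    return charge / (3.6 * host_molar_mass())


# Maximum loadings from the abstract: x = 36 for Li, x = 32 for Na.
li_capacity = gravimetric_capacity(36 / FORMULA_UNITS_PER_SUPERCELL)
na_capacity = gravimetric_capacity(32 / FORMULA_UNITS_PER_SUPERCELL)

print(f"Li: {li_capacity:.1f} mAh/g, Na: {na_capacity:.1f} mAh/g")
```

Normalizing to the host mass is only one convention. Including the mass of the adsorbed ions in M would give lower values, so the agreement with the reported numbers is a consistency check and does not confirm the authors' procedure.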

  • Article type: Journal Article
    Secreted signaling peptides are central regulators of growth, development, and stress responses, but specific steps in the evolution of these peptides and their receptors are not well understood. Also, the molecular mechanisms of peptide-receptor binding are known for only a few examples, primarily because protein structure determination capabilities are available to only a few laboratories worldwide. Plants have evolved a multitude of secreted signaling peptides and corresponding transmembrane receptors. Stress-responsive SERINE RICH ENDOGENOUS PEPTIDES (SCOOPs) were recently identified. Bioactive SCOOPs are proteolytically processed by subtilases and are perceived by the leucine-rich repeat receptor kinase MALE DISCOVERER 1-INTERACTING RECEPTOR-LIKE KINASE 2 (MIK2) in the model plant Arabidopsis thaliana. How SCOOPs and MIK2 have (co)evolved, and how SCOOPs bind to MIK2, are unknown. Using in silico analysis of 350 plant genomes and subsequent functional testing, we revealed the conservation of MIK2 as a SCOOP receptor within the plant order Brassicales. We then leveraged AI-based structural modeling and comparative genomics to identify two conserved putative SCOOP-MIK2 binding pockets across Brassicales MIK2 homologues predicted to interact with the "SxS" motif of otherwise sequence-divergent SCOOPs. Mutagenesis of both predicted binding pockets compromised SCOOP binding to MIK2, SCOOP-induced complex formation between MIK2 and its coreceptor BRASSINOSTEROID INSENSITIVE 1-ASSOCIATED KINASE 1, and SCOOP-induced reactive oxygen species production, thus confirming our in silico predictions. Collectively, in addition to revealing the elusive SCOOP-MIK2 binding mechanism, our analytic pipeline combining phylogenomics, AI-based structural predictions, and experimental biochemical and physiological validation provides a blueprint for the elucidation of peptide ligand-receptor perception mechanisms.

  • Article type: Editorial
    No abstract available.

  • Article type: Journal Article
    Understanding how proteins evolve under selective pressure is a longstanding challenge. The immensity of the search space has limited efforts to systematically evaluate the impact of multiple simultaneous mutations, so mutations have typically been assessed individually. However, epistasis, or the way in which mutations interact, prevents accurate prediction of combinatorial mutations based on measurements of individual mutations. Here, we use artificial intelligence to define the entire functional sequence landscape of a protein binding site in silico, and we call this approach Complete Combinatorial Mutational Enumeration (CCME). By leveraging CCME, we are able to construct a comprehensive map of the evolutionary connectivity within this functional sequence landscape. As a proof of concept, we applied CCME to the ACE2 binding site of the SARS-CoV-2 spike protein receptor binding domain. We selected representative variants from across the functional sequence landscape for testing in the laboratory. We identified variants that retained the ability to bind ACE2 despite changes at over 40% of the evaluated residue positions, and these variants now escape binding and neutralization by monoclonal antibodies. This work represents a crucial first step toward precise prediction of pathogen evolution, opening avenues for proactive mitigation.

  • Article type: Journal Article
    Artificial intelligence has revolutionized the field of protein structure prediction. However, with more powerful and complex software being developed, it is accessibility and ease of use, rather than capability, that is quickly becoming a limiting factor for end users. LazyAF is a Google Colaboratory-based pipeline that integrates the existing ColabFold BATCH software to streamline the process of medium-scale protein-protein interaction prediction. LazyAF was used to predict the interactome of the 76 proteins encoded on the broad-host-range multi-drug resistance plasmid RK2, demonstrating the ease and accessibility the pipeline provides.
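
To give a sense of what "medium-scale" could mean for 76 proteins, the sketch below counts the pairwise combinations an all-against-all screen would need. This is purely illustrative. The abstract does not say whether LazyAF screens every pair or includes self-pairs (homodimers), so both counts are shown, and the protein names are placeholders rather than RK2 gene names.

```python
from itertools import combinations, combinations_with_replacement

N_PROTEINS = 76  # number of RK2-encoded proteins given in the abstract

# Placeholder identifiers; the real protein names are not listed in the abstract.
proteins = [f"protein_{i:02d}" for i in range(1, N_PROTEINS + 1)]

# Unordered pairs of two different proteins (heterodimer candidates).
hetero_pairs = list(combinations(proteins, 2))

# The same pairs plus each protein paired with itself (homodimer candidates).
all_pairs = list(combinations_with_replacement(proteins, 2))

print(len(hetero_pairs), "hetero pairs;", len(all_pairs), "pairs including self-pairs")
```

Either way the job list runs to a few thousand predictions, which is the scale at which batch automation starts to matter more than single-structure tooling.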

  • Article type: Journal Article
    Spectroscopic data, particularly diffraction data, are essential for materials characterization because of the comprehensive crystallographic information they contain. Current crystallographic phase identification, however, is very time-consuming. To address this challenge, we have developed a real-time crystallographic phase identifier based on a convolutional self-attention neural network (CPICANN). Trained on 692 190 simulated powder X-ray diffraction (XRD) patterns from 23 073 distinct inorganic crystallographic information files, CPICANN demonstrates superior phase-identification power. Single-phase identification on simulated XRD patterns yields accuracies of 98.5 and 87.5% with and without elemental information, respectively, outperforming JADE software (68.2 and 38.7%, respectively). Bi-phase identification on simulated XRD patterns achieves accuracies of 84.2 and 51.5%, respectively. In experimental settings, CPICANN achieves an 80% identification accuracy, surpassing JADE software (61%). Integration of CPICANN into XRD refinement software will significantly advance the state of the art in XRD materials characterization.

  • Article type: Journal Article
    From its conception, X-ray crystallography has provided a unique understanding of the structure, bonding and electronic state of materials, which, in turn, unlocks a means of examining the properties and function of crystalline systems. Using state-of-the-art single-crystal X-ray diffraction, along with UV-Vis spectroscopy and DFT calculations, Zwolenik et al. [(2024). IUCrJ, 11, 519-527] have provided a comprehensive study of the structure-optical property relationship of 1,3-diacetylpyrene with methodologies that are increasingly accessible to non-specialist laboratories.

  • Article type: Journal Article
    Variation in mutation rates at sites in proteins can largely be understood through the constraint that proteins must fold into stable structures. Models that calculate site-specific rates based on protein structure and a thermodynamic stability model have shown a significant but modest ability to predict empirical site-specific rates calculated from sequence. Models that use detailed atomistic models of protein energetics do not outperform simpler approaches using packing density. We demonstrate that a fundamental reason for this is that empirical site-specific rates are the result of the average effect of many different microenvironments in a phylogeny. By analyzing the results of evolutionary dynamics simulations, we show how averaging site-specific rates across many extant protein structures can lead to correct recovery of site-rate prediction. This result is also demonstrated in natural protein sequences and experimental structures. Using predicted structures, we demonstrate that atomistic models can improve upon contact density metrics in predicting site-specific rates from a structure. The results give fundamental insights into the factors governing the distribution of site-specific rates in protein families.

  • Article type: Journal Article
    In silico validation of de novo designed proteins with deep learning (DL)-based structure prediction algorithms has become mainstream. However, formal evidence of the relationship between a high-quality predicted model and the chance of experimental success is lacking. We used experimentally characterized de novo water-soluble and transmembrane β-barrel designs to show that AlphaFold2 and ESMFold excel at different tasks. ESMFold can efficiently identify designs generated from high-quality (designable) backbones. However, only AlphaFold2 can predict which sequences among similar designs have the best chance of folding experimentally. We show that ESMFold can generate high-quality structures from just a few predicted contacts and introduce a new approach based on incremental perturbation of the prediction ("in silico melting"), which can reveal differences in the presence of favorable contacts between designs. This study provides new insight into the explainability of DL-based structure prediction models and into how they could be leveraged for the design of increasingly complex proteins, in particular membrane proteins, which have historically lacked basic in silico validation tools.