structure prediction

  • Article type: Journal Article
    Proteins perform their biological functions through motion. Although high-throughput prediction of the three-dimensional static structures of proteins has proved feasible using deep-learning-based methods, predicting conformational motions remains a challenge. Purely data-driven machine learning methods have difficulty addressing such motions because available laboratory data on conformational motions are still limited. In this work, we develop a method for generating protein allosteric motions by integrating physical energy landscape information into deep-learning-based methods. We show that local energetic frustration, which quantifies the local features of the energy landscape governing protein allosteric dynamics, can be used to empower AlphaFold2 (AF2) to predict protein conformational motions. Starting from ground-state static structures, this integrative method generates alternative structures as well as pathways of protein conformational motions by progressively enhancing the energetic frustration features in the input multiple sequence alignment. For a model protein, adenylate kinase, we show that the generated conformational motions are consistent with available experimental and molecular dynamics simulation data. Applying the method to two other proteins, KaiB and ribose-binding protein, which undergo large-amplitude conformational changes, also successfully generates the alternative conformations. We also show how to extract overall features of the AF2 energy landscape topography, which many have considered to be a black box. Incorporating physical knowledge into deep-learning-based structure prediction algorithms provides a useful strategy for addressing the challenges of dynamic structure prediction of allosteric proteins.

  • Article type: Journal Article
    A new two-dimensional (2D) non-MXene transition metal carbide, Mo3C2, was found using the USPEX code. Comprehensive first-principles calculations show that the Mo3C2 monolayer exhibits thermal, dynamic, and mechanical stability, which can ensure excellent durability in practical applications. The optimized structures of Lix@(3×3)-Mo3C2 (x = 1-36) and Nax@(3×3)-Mo3C2 (x = 1-32) were identified as prospective anode materials. The metallic Mo3C2 sheet exhibits low diffusion barriers of 0.190 eV for Li and 0.118 eV for Na and low average open circuit voltages of 0.31-0.55 V for Li and 0.18-0.48 V for Na. When adsorbing two layers of adatoms, the theoretical energy capacities are 344 and 306 mA h g⁻¹ for Li and Na, respectively, which are comparable to that of commercial graphite. Moreover, the Mo3C2 substrate can maintain structural integrity during the lithiation or sodiation process at high temperature. Considering these features, our proposed Mo3C2 slab is a potential candidate as an anode material for future Li- and Na-ion batteries.
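
The capacities quoted in this abstract can be checked with the standard gravimetric-capacity formula C = nF / (3.6 M), where n is the number of adatoms per host formula unit, F is the Faraday constant, and M is the molar mass of the host. The sketch below is not taken from the paper: it assumes that the (3×3) supercell contains 9 Mo3C2 formula units and that the adatom mass is excluded from M, which is the usual convention. The function and variable names are illustrative only.

```python
# Gravimetric capacity check for Li/Na on Mo3C2.
# Assumptions (not stated in the abstract): the (3x3) supercell holds
# 9 Mo3C2 formula units, and the capacity is normalized by the host mass only.

FARADAY = 96485.33212                        # C/mol
ATOMIC_MASS = {"Mo": 95.95, "C": 12.011}     # g/mol, standard atomic weights
FORMULA_UNITS_IN_SUPERCELL = 9               # assumption, see lead-in


def host_molar_mass(n_mo=3, n_c=2):
    """Molar mass of one Mo3C2 formula unit in g/mol."""
    return n_mo * ATOMIC_MASS["Mo"] + n_c * ATOMIC_MASS["C"]


def theoretical_capacity(adatoms_per_formula_unit, molar_mass):
    """Capacity in mA h per gram; 1 mA h equals 3.6 C."""
    return adatoms_per_formula_unit * FARADAY / (3.6 * molar_mass)


molar_mass = host_molar_mass()
# Maximum coverages from the abstract: x = 36 for Li, x = 32 for Na.
li_capacity = theoretical_capacity(36 / FORMULA_UNITS_IN_SUPERCELL, molar_mass)
na_capacity = theoretical_capacity(32 / FORMULA_UNITS_IN_SUPERCELL, molar_mass)

print(round(li_capacity), round(na_capacity))
```

Under these assumptions the rounded values agree with the 344 and 306 mA h g⁻¹ reported above, which suggests the quoted capacities correspond to the maximum coverages x = 36 and x = 32.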

  • Article type: Journal Article
    In this chapter, we predict the structure of the Arabidopsis receptor-homology-transmembrane-RING-H2 isoform 1 (RMR1) in complex with the C-terminal sorting determinant of cruciferin (CRU1) with AlphaFold2, using the ColabFold web interface, and perform molecular dynamics simulations to probe the dynamics of the predicted structures. Our results predict that the C-terminal carboxylate group of the ctVSD of CRU1 is recognized by the conserved Arg89 of the cargo-binding loop of RMR1, and that Arg468 of CRU1 is recognized by negatively charged residues in the cargo-binding pocket of RMR1. The procedures described here are useful for modeling other protein complexes.

  • Article type: Journal Article
    Spectroscopic data, particularly diffraction data, are essential for materials characterization because of the comprehensive crystallographic information they carry. Current crystallographic phase identification, however, is very time-consuming. To address this challenge, we have developed a real-time crystallographic phase identifier based on a convolutional self-attention neural network (CPICANN). Trained on 692 190 simulated powder X-ray diffraction (XRD) patterns from 23 073 distinct inorganic crystallographic information files, CPICANN demonstrates superior phase-identification power. Single-phase identification on simulated XRD patterns yields accuracies of 98.5 and 87.5% with and without elemental information, respectively, outperforming JADE software (68.2 and 38.7%, respectively). Bi-phase identification on simulated XRD patterns achieves accuracies of 84.2 and 51.5%, respectively. In experimental settings, CPICANN achieves an 80% identification accuracy, surpassing JADE software (61%). Integration of CPICANN into XRD refinement software will significantly advance the cutting-edge technology in XRD materials characterization.
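
For readers unfamiliar with how peak positions in a simulated powder XRD pattern arise, the sketch below applies Bragg's law, nλ = 2d sin θ, to convert lattice-plane spacings into 2θ angles. It is a generic illustration and not CPICANN's simulation pipeline, which the abstract does not describe. The Cu Kα1 wavelength and the silicon (111) spacing are assumed example values, and the function name is hypothetical.

```python
import math

# Assumed X-ray source: Cu K-alpha1, wavelength in angstroms.
CU_K_ALPHA = 1.5406


def two_theta_deg(d_spacing, wavelength=CU_K_ALPHA, order=1):
    """Return the Bragg angle 2-theta in degrees for a plane spacing in angstroms.

    Returns None when the reflection is geometrically impossible
    (order * wavelength exceeds 2 * d_spacing).
    """
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if sin_theta > 1.0:
        return None
    return 2.0 * math.degrees(math.asin(sin_theta))


# Example input: approximate d-spacing of the silicon (111) planes, in angstroms.
si_111 = two_theta_deg(3.1355)
print(f"Si (111): 2-theta = {si_111:.2f} degrees")
```

A full simulated pattern would also need peak intensities (from structure factors) and a peak-shape model; only the peak positions are sketched here.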

  • Article type: Journal Article
    Oriented synthesis of functional materials is a focus of attention in materials science. Among the most important functional materials, infrared nonlinear optical materials with large second harmonic generation effects and broad optical band gaps are urgently needed. In this work, guided by theoretical structure prediction, the first series of non-centrosymmetric (NCS) alkali-alkaline earth metal [PS4]-based thiophosphates, LiCaPS4 (Ama2), NaCaPS4 (P21), KCaPS4 (Pna21), RbCaPS4 (Pna21), and CsCaPS4 (Pna21), were successfully synthesized. Comprehensive characterizations reveal that ACaPS4 can be regarded as promising IR NLO materials, exhibiting wide band gaps (3.77-3.86 eV), moderate birefringence (0.027-0.064 at 1064 nm), high laser-induced damage thresholds (LIDT, ~10×AGS), and suitable phase-matching second harmonic generation responses (0.4-0.6×AGS). Structure-property analyses show that the Ca-S bonds have non-negligible covalent character, and that [PS4] together with [CaSn] units play dominant roles in determining the band gap and SHG response. This work indicates that the Li, Na, and K analogs may be promising infrared nonlinear optical material candidates. It is the first successful case of "prediction to synthesis" involving infrared (IR) nonlinear optical (NLO) crystals in the thiophosphate system and may provide a new avenue for the design and oriented synthesis of high-performance functional materials in the future.

  • Article type: Journal Article
    In recent decades, antibodies have emerged as indispensable therapeutics for combating diseases, particularly viral infections. However, their development has been hindered by limited structural information and labor-intensive engineering processes. Fortunately, significant advances in deep learning methods have facilitated the precise prediction of protein structure and function by leveraging co-evolution information from homologous proteins. Despite these advances, predicting the conformation of antibodies remains challenging because of their unique evolution and the high flexibility of their antigen-binding regions. Here, to address this challenge, we present the Bio-inspired Antibody Language Model (BALM). This model is trained on a vast dataset comprising 336 million 40% nonredundant unlabeled antibody sequences, capturing both unique and conserved properties specific to antibodies. Notably, BALM shows exceptional performance across four antigen-binding prediction tasks. Moreover, we introduce BALMFold, an end-to-end method derived from BALM, capable of swiftly predicting full atomic antibody structures from individual sequences. Remarkably, BALMFold outperforms well-established methods such as AlphaFold2, IgFold, ESMFold and OmegaFold on the antibody benchmark, demonstrating significant potential to advance innovative engineering and streamline therapeutic antibody development by reducing the need for unnecessary trials. The BALMFold structure prediction server is freely available at https://beamlab-sh.com/models/BALMFold.

  • Article type: Journal Article
    The Shanxi aged vinegar microbiome encodes a wide variety of bacteriocins. The aim of this study was to mine, screen and characterize novel broad-spectrum bacteriocins from the large-scale microbiome data of Shanxi aged vinegar through machine learning, molecular simulation and activity validation. Using machine learning techniques, a total of 158 potential bacteriocins were mined from 117,552 representative genes based on metatranscriptomic information from the Shanxi aged vinegar microbiome, and 12 microorganisms were identified at the genus level as secreting bacteriocins. Subsequently, using AlphaFold2 structure prediction and molecular dynamics simulations, eight bacteriocins with high stability were further screened, and all of them were confirmed to have bacteriostatic activity using the Escherichia coli BL21 expression system. Then, gene_386319 (named LAB-3) and gene_403047 (named LAB-4), which showed the strongest antibacterial activities, were purified by two-step methods and analyzed by mass spectrometry. The two bacteriocins have broad-spectrum antimicrobial activity, with minimum inhibitory concentration values of 6.79 μg/mL-15.31 μg/mL against Staphylococcus aureus and Escherichia coli. Furthermore, molecular docking analysis indicated that LAB-3 and LAB-4 could interact with dihydrofolate reductase through hydrogen bonds, salt-bridge forces and hydrophobic forces. These findings suggest that the two bacteriocins are promising broad-spectrum antimicrobial agents.

  • Article type: Journal Article
    X-ray scattering/diffraction tensor tomography techniques are promising methods for acquiring the 3D texture information of heterogeneous biological tissues at micrometre resolution. However, the methods suffer from a long overall acquisition time due to multi-dimensional scanning across real and reciprocal space. Here, a new approach is introduced to obtain the 3D reciprocal information of each illuminated scanning volume using mathematical modeling, which is equivalent to a physical scanning procedure for collecting the full reciprocal information required for voxel reconstruction. The virtual reciprocal scanning scheme was validated by a simulated 6D wide-angle X-ray diffraction tomography experiment. The theoretical validation of the method represents an important technological advancement for 6D diffraction tensor tomography and a crucial step towards pervasive applications in the characterization of heterogeneous materials.

  • Article type: Journal Article
    In recent years, cyclic peptides have emerged as a promising therapeutic modality due to their diverse biological activities. Understanding the structures of these cyclic peptides and their complexes is crucial for gaining insights into protein target-cyclic peptide interactions, which can facilitate the development of novel related drugs. However, conducting experimental observations is time-consuming and expensive, and computer-aided drug design methods are not practical enough in real-world applications. To tackle this challenge, we introduce HighFold, an AlphaFold-derived model, in this study. By integrating specific details about the head-to-tail circle and disulfide bridge structures, the HighFold model can accurately predict the structures of cyclic peptides and their complexes. Our model demonstrates superior predictive performance compared to other existing approaches, representing a significant advancement in structure-activity research. The HighFold model is openly accessible at https://github.com/hongliangduan/HighFold.
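
The abstract does not say how head-to-tail cyclization is encoded. One plausible way to make a sequence-based model aware of a ring is to replace the linear residue offset j - i with the signed shortest separation around the ring, so that the first and last residues become neighbours. The sketch below illustrates that idea only; the function names are hypothetical and it should not be read as HighFold's actual implementation.

```python
def cyclic_offset(i, j, n):
    """Signed shortest separation from residue i to residue j on a ring of n residues.

    In a linear chain the offset is simply j - i; on a ring the path may wrap
    around the head-to-tail bond, so residue 0 and residue n - 1 are adjacent.
    For even n, the two residues exactly opposite each other get offset +n/2.
    """
    d = (j - i) % n
    return d if d <= n // 2 else d - n


def cyclic_offset_matrix(n):
    """Pairwise cyclic offsets for a head-to-tail cyclized peptide of length n."""
    return [[cyclic_offset(i, j, n) for j in range(n)] for i in range(n)]


# A six-residue cyclic peptide: residue 5 is one step "before" residue 0.
matrix = cyclic_offset_matrix(6)
for row in matrix:
    print(row)
```

In this encoding the linear offset of 5 between the first and last residues of a hexapeptide becomes -1, which is the information a relative positional encoding would need to treat the two termini as bonded.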

  • Article type: Journal Article
    Oral poisoning can trigger diverse physiological reactions, determined by the toxic substance involved. One such consequence is hyperchloremia, which is characterized by an elevated level of chloride in the blood and leads to kidney damage and impaired chloride ion regulation. Here, we conducted a comprehensive genome-wide analysis to investigate genes or proteins linked to hyperchloremia. Our analysis included functional enrichment, protein-protein interactions, gene expression, exploration of molecular pathways, and the identification of potential shared genetic factors contributing to the development of hyperchloremia. Functional enrichment analysis revealed that oral poisoning involving hyperchloremia is associated with four proteins: Kelch-like protein 3, serine/threonine-protein kinase WNK4, serine/threonine-protein kinase WNK1 and Cullin-3. The protein-protein interaction network revealed Cullin-3 as an exceptional protein, displaying a maximum connection of 18 nodes. The transcriptomic data were insufficient: information directly linking these proteins and their human-related functions to oral poisoning, hyperchloremia, or metabolic acidosis is lacking. The metabolic pathway of the Cullin-3 protein revealed that the derivative is sulfonamide, which plays a role in increasing urine output, and that metabolic acidosis resulted in hypertension. Molecular docking analysis found that Cullin-3 has the lowest binding energy score, making it a suitable protein. Moreover, no major variations were observed in unbound Cullin-3 or in the three peptide-bound complexes, and all systems remained compact during 50 ns simulations. Our results suggest that Cullin-3 is a strong foundation for developing potential drug targets or biomarkers in future studies.