computational structural biology

  • Article type: Journal Article
    Acquired immunodeficiency syndrome (AIDS) is caused by human immunodeficiency virus (HIV). HIV protease, reverse transcriptase, and integrase are the targets of current drugs for the disease. However, drug-resistant strains have emerged quickly due to the high mutation rate of the virus, creating demand for new drugs. One attractive target is the Gag-Pol polyprotein, which plays a key role in the HIV life cycle. Recently, we found that a combination of M50I and V151I mutations in HIV-1 integrase can suppress virus release and inhibit the initiation of Gag-Pol autoprocessing and maturation without interfering with the dimerization of Gag-Pol. Additional mutations in integrase or in the RNase H domain of reverse transcriptase can compensate for the defect. However, the molecular mechanism is unknown, and no tertiary structure of the full-length HIV-1 Pol protein is available for further study. Therefore, we developed a workflow to predict the tertiary structure of the HIV-1 NL4.3 Pol polyprotein. The modeled structure is of comparable quality to the recently published partial HIV-1 Pol structure (PDB ID: 7SJX). Our HIV-1 NL4.3 Pol dimer model is the first full-length Pol tertiary structure. It can provide a structural platform for studying the autoprocessing mechanism of HIV-1 Pol and for developing new potent drugs. Moreover, the workflow can be used to predict other large protein structures that cannot be resolved via conventional experimental methods.

  • Article type: Journal Article
    De novo protein design enhances our understanding of the principles that govern protein folding and interactions, and has the potential to revolutionize biotechnology through the engineering of novel protein functionalities. Despite recent progress in computational design strategies, de novo design of protein structures remains challenging, given the vast size of the sequence-structure space. AlphaFold2 (AF2), a state-of-the-art neural network architecture, achieved remarkable accuracy in predicting protein structures from amino acid sequences. This raises the question of whether AF2 has learned the principles of protein folding sufficiently for de novo design. Here, we sought to answer this question by inverting the AF2 network, using the prediction weight set and a loss function to bias the generated sequences to adopt a target fold. Initial design trials resulted in de novo designs with an overrepresentation of hydrophobic residues on the protein surface compared to their natural protein family, requiring additional surface optimization. In silico validation of the designs showed protein structures with the correct fold, a hydrophilic surface and a densely packed hydrophobic core. In vitro validation showed that 7 out of 39 designs were folded and stable in solution with high melting temperatures. In summary, our design workflow based solely on AF2 does not seem to fully capture basic principles of de novo protein design, as observed in the hydrophobic vs. hydrophilic patterning of the protein surface. However, with minimal post-design intervention, these pipelines generated viable sequences as assessed by experimental characterization. Thus, such pipelines show the potential to contribute to solving outstanding challenges in de novo protein design.
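The surface hydrophobicity check mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the hydrophobic residue set, the function name `hydrophobic_fraction`, the example sequence, and the list of surface positions (which in practice would come from a solvent-accessibility calculation on the predicted structure) are all assumptions made for illustration.

```python
# Illustrative sketch only. The residue classification below is one common
# convention and is an assumption, not taken from the abstract.
HYDROPHOBIC = frozenset("AVILMFWC")


def hydrophobic_fraction(sequence, surface_positions):
    """Return the fraction of the given surface positions that hold a
    hydrophobic residue.

    sequence          -- one-letter amino acid string
    surface_positions -- 0-based indices of solvent-exposed residues
                         (hypothetical input, e.g. from an accessibility tool)
    """
    if not surface_positions:
        raise ValueError("no surface positions given")
    surface = [sequence[i] for i in surface_positions]
    return sum(res in HYDROPHOBIC for res in surface) / len(surface)


if __name__ == "__main__":
    design = "MKLVDEAIRK"          # hypothetical designed sequence
    exposed = [1, 3, 4, 5, 8, 9]   # hypothetical exposed positions
    print(f"hydrophobic surface fraction: {hydrophobic_fraction(design, exposed):.2f}")
```

Comparing this fraction between a design and members of its natural protein family is one simple way to flag the kind of overrepresentation the abstract describes.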

  • Article type: Journal Article
    Deeper understanding of T-cell-mediated adaptive immune responses is important for the design of cancer immunotherapies and antiviral vaccines against pandemic outbreaks. T-cells are activated when they recognize foreign peptides that are presented on the cell surface by Major Histocompatibility Complexes (MHC), forming peptide:MHC (pMHC) complexes. 3D structures of pMHC complexes provide fundamental insight into T-cell recognition mechanisms and aid immunotherapy design. High MHC and peptide diversities necessitate efficient computational modelling to enable whole-proteome structural analysis. We developed PANDORA, a generic modelling pipeline for pMHC class I and II (pMHC-I and pMHC-II), and present its performance on pMHC-I here. Given a query, PANDORA searches for structural templates in its extensive database and then applies anchor restraints to the modelling process. This restrained energy minimization ensures one of the fastest pMHC modelling pipelines so far. On a set of 835 pMHC-I complexes over 78 MHC types, PANDORA generated models with a median RMSD of 0.70 Å and achieved a 93% success rate in the top 10 models. PANDORA performs competitively with three state-of-the-art pMHC-I modelling approaches and outperforms AlphaFold2 in terms of accuracy while being superior to it in speed. PANDORA is a modularized and user-configurable Python package with easy installation. We envision PANDORA to fuel deep learning algorithms with large-scale high-quality 3D models to tackle long-standing immunology challenges.
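The two evaluation figures quoted above (median RMSD and success rate among the top-ranked models) can be sketched as follows. This is not PANDORA code: the function names, the example numbers, and in particular the RMSD threshold used to call a model a success are assumptions, since the abstract does not state the threshold. The sketch also assumes the model and reference coordinates are already superposed in the same frame.

```python
import math
from statistics import median


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) points that are assumed to be already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def success_rate(rmsds_per_case, threshold):
    """Fraction of test cases in which at least one of the top-ranked
    models has an RMSD at or below the threshold (hypothetical criterion)."""
    hits = sum(1 for case in rmsds_per_case if min(case) <= threshold)
    return hits / len(rmsds_per_case)


if __name__ == "__main__":
    # Hypothetical RMSD values (in angstroms) for the top models of three cases.
    cases = [[0.6, 1.4, 2.9], [2.5, 2.2, 3.1], [0.9, 0.7, 1.8]]
    print("median best RMSD:", median(min(case) for case in cases))
    print("success rate:", success_rate(cases, threshold=2.0))
```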

  • Article type: Journal Article
    Allergy is an intensifying disease among the world population, particularly in the developed world. Once allergy develops, sufferers are permanently trapped in a hyper-immune response that makes them sensitive to innocuous substances. The immune pathway concerned with developing allergy is the Th2 immune pathway, in which the IgE antibody binds to its FcεRI receptor on mast cells and basophils. This paper discusses a protocol that could disrupt the binding between the antibody and its receptor for a potential permanent treatment. Ten proteins were computationally designed to display a human IgE motif in close proximity to the FcεRI receptor binding site of the IgE antibody, so that these proteins could be used as a vaccine against our own IgE antibody. The motif of interest was the FG loop motif. It was excised and grafted onto a Staphylococcus aureus protein (PDB ID 1YN3), and the sequence of the motif + scaffold structure was then re-designed around the motif to find an amino acid sequence that would fold correctly to the designed structure. These ten computationally designed proteins showed successful folding when simulated using Rosetta's AbinitioRelax folding simulation, and the IgE epitope was clearly displayed in its native three-dimensional structure in all of them. These designed proteins have the potential to be used as a pan anti-allergy vaccine. This work employed in silico based methods for designing the proteins and did not include any experimental verification.

  • Article type: Journal Article
    Best vitelliform macular dystrophy (BVMD) is an autosomal dominant macular degeneration. The typical central yellowish yolk-like lesion usually appears in childhood and gradually worsens. Most cases are caused by variants in the BEST1 gene which encodes bestrophin-1, an integral membrane protein found primarily in the retinal pigment epithelium.
    Here we describe the spectrum of BEST1 variants identified in a cohort of 57 Italian patients analyzed by Sanger sequencing. In 13 cases, the study also included segregation analysis in affected and unaffected relatives. We used molecular mechanics to calculate two quantitative parameters related to calcium-activated chloride channel (CaCC composed of 5 BEST1 subunits) stability and calcium-dependent activation and related them to the potential pathogenicity of individual missense variants detected in the probands.
    Thirty-six out of 57 probands (63% positivity) and 16 out of 18 relatives proved positive on genetic testing. Family study confirmed the variable penetrance and expressivity of the disease. Six of the 27 genetic variants discovered were novel: p.(Val9Gly), p.(Ser108Arg), p.(Asn179Asp), p.(Trp182Arg), p.(Glu292Gln) and p.(Asn296Lys). All BEST1 variants were assessed in silico for potential pathogenicity. Our computational structural biology approach based on a 3D model structure of the CaCC showed that individual amino acid replacements may affect channel shape, stability, activation, gating, selectivity and throughput, and possibly also other features, depending on where the individual mutated amino acid residues are located in the tertiary structure of BEST1. Statistically significant correlations between mean logMAR best-corrected visual acuity (BCVA), age and modulus of computed BEST1 dimerization energies, which reflect variations in CaCC stability due to amino acid changes, permitted us to assess the pathogenicity of individual BEST1 variants.
    Using this computational approach, we designed a method for estimating BCVA progression in patients with BEST1 variants.

  • Article type: Journal Article
    Bioinformatics is a very resourceful tool to understand evolution of membrane proteins, such as transient receptor potential channels. Expert bioinformatics users rely on specialized scripting and programming skills. Several web servers and standalone tools are available for nonadvanced users willing to develop projects to understand their system of choice. In this case, we present a desktop-based protocol to develop evostructural hypotheses based on basic bioinformatics skills and resources, specifically for a small subgroup of TRPV channels, which can be further implemented for larger datasets.

  • Article type: Journal Article
    Cryo-electron microscopy (cryo-EM) is becoming the imaging method of choice for determining protein structures. Many atomic structures have been resolved based on an exponentially growing number of published three-dimensional (3D) high-resolution cryo-EM density maps. However, the resolution value claimed for the reconstructed 3D density map has been the topic of scientific debate for many years. The Fourier Shell Correlation (FSC) is the currently accepted cryo-EM resolution measure, but it can be subjective, can be manipulated, and has its own limitations. In this study, we first propose supervised deep learning methods to extract representative 3D features at high, medium and low resolutions from simulated protein density maps and build classification models that objectively validate resolutions of experimental 3D cryo-EM maps. Specifically, we build classification models based on dense artificial neural network (DNN) and 3D convolutional neural network (3D CNN) architectures. The trained models can classify a given 3D cryo-EM density map into one of three resolution levels: high, medium, low. The preliminary DNN and 3D CNN models achieved 92.73% accuracy and 99.75% accuracy on simulated test maps, respectively. Applying the DNN and 3D CNN models to thirty experimental cryo-EM maps achieved an agreement of 60.0% and 56.7%, respectively, with the author-published resolution value of the density maps. We further augment these previous techniques and present preliminary results of a 3D U-Net model for local resolution classification. The model was trained to perform voxel-wise classification of 3D cryo-EM density maps into one of ten resolution classes, instead of a single global resolution value. The U-Net model achieved 88.3% and 94.7% accuracy when evaluated on experimental maps with local resolutions determined by MonoRes and ResMap methods, respectively. Our results suggest deep learning can potentially improve the resolution evaluation process of experimental cryo-EM maps.

  • Article type: Journal Article
    Chemical crosslinking can identify the neighborhood relationships between specific amino-acid residues in proteins. The interpretation of crosslinking data is typically performed using single, static atomic structures. However, proteins are dynamic, undergoing motions spanning from local fluctuations of individual residues to global motions of protein assemblies. Here we demonstrate that failure to explicitly accommodate dynamics when interpreting crosslinks structurally can lead to considerable errors. We present a method and associated software, DynamXL, which is able to account directly for flexibility in the context of crosslinking modeling. Our benchmarking on a large dataset of model structures demonstrates significantly improved rationalization of experimental crosslinking data, and enhanced performance in a protein-protein docking protocol. These advances will provide a considerable increase in the structural insights attainable using chemical crosslinking coupled to mass spectrometry.
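The point about static versus dynamic interpretation of crosslinks can be made concrete with a minimal sketch. It does not reproduce DynamXL or its algorithm: the function names, the straight-line distance criterion, the example coordinates, and the 30 Å cutoff are all assumptions chosen for illustration. The idea shown is only that a crosslink violated in a single static structure may be satisfied by another conformer in an ensemble.

```python
import math


def crosslink_satisfied(structure, res_i, res_j, max_dist):
    """True if the two residues lie within max_dist in one structure.

    structure -- dict mapping residue number to an (x, y, z) tuple
                 (hypothetical representation, one point per residue)
    """
    return math.dist(structure[res_i], structure[res_j]) <= max_dist


def crosslink_satisfied_by_ensemble(ensemble, res_i, res_j, max_dist):
    """True if at least one conformer in the ensemble satisfies the crosslink."""
    return any(
        crosslink_satisfied(conformer, res_i, res_j, max_dist)
        for conformer in ensemble
    )


if __name__ == "__main__":
    # Two hypothetical conformers of the same protein (coordinates in angstroms).
    open_state = {10: (0.0, 0.0, 0.0), 85: (40.0, 0.0, 0.0)}
    closed_state = {10: (0.0, 0.0, 0.0), 85: (20.0, 0.0, 0.0)}
    cutoff = 30.0  # hypothetical crosslinker reach

    print("static (open state only):",
          crosslink_satisfied(open_state, 10, 85, cutoff))
    print("ensemble (open + closed):",
          crosslink_satisfied_by_ensemble([open_state, closed_state], 10, 85, cutoff))
```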

  • Article type: Journal Article
    Despite the recent success of newly developed direct-acting antivirals against hepatitis C, the disease continues to be a global health threat due to the lack of diagnosis of most carriers and the high cost of treatment. The heterodimer formed by glycoproteins E1 and E2 within the hepatitis C virus (HCV) lipid envelope is a potential vaccine candidate and antiviral target. While the structure of E1/E2 has not yet been resolved, partial crystal structures of the E1 and E2 ectodomains have been determined. The unresolved parts of the structure are within the realm of what can be modeled with current computational modeling tools. Furthermore, a variety of additional experimental data is available to support computational predictions of E1/E2 structure, such as data from antibody binding studies, cryo-electron microscopy (cryo-EM), mutational analyses, peptide binding analysis, linker-scanning mutagenesis, and nuclear magnetic resonance (NMR) studies. In accordance with these rich experimental data, we have built an in silico model of the full-length E1/E2 heterodimer. Our model supports that E1/E2 assembles into a trimer, which was previously suggested from a study by Falson and coworkers (P. Falson, B. Bartosch, K. Alsaleh, B. A. Tews, A. Loquet, Y. Ciczora, L. Riva, C. Montigny, C. Montpellier, G. Duverlie, E. I. Pecheur, M. le Maire, F. L. Cosset, J. Dubuisson, and F. Penin, J. Virol. 89:10333-10346, 2015, https://doi.org/10.1128/JVI.00991-15). Size exclusion chromatography and Western blotting data obtained by using purified recombinant E1/E2 support our hypothesis. Our model suggests that during virus assembly, the trimer of E1/E2 may be further assembled into a pentamer, with 12 pentamers comprising a single HCV virion. 
    We anticipate that this new model will provide a useful framework for HCV envelope structure and the development of antiviral strategies.
    IMPORTANCE: One hundred fifty million people have been estimated to be infected with hepatitis C virus, and many more are at risk for infection. A better understanding of the structure of the HCV envelope, which is responsible for attachment and fusion, could aid in the development of a vaccine and/or new treatments for this disease. We draw upon computational techniques to predict a full-length model of the E1/E2 heterodimer based on the partial crystal structures of the envelope glycoproteins E1 and E2. E1/E2 has been widely studied experimentally, and this provides valuable data, which has assisted us in our modeling. Our proposed structure is used to suggest the organization of the HCV envelope. We also present new experimental data from size exclusion chromatography that support our computational prediction of a trimeric oligomeric state of E1/E2.

  • Article type: Journal Article
    Research projects in structural biology increasingly rely on combinations of heterogeneous sources of information, e.g. evolutionary information from multiple sequence alignments, experimental evidence in the form of density maps and proximity constraints from proteomics experiments. The OpenStructure software framework, which allows the seamless integration of information of different origin, has previously been introduced. The software consists of C++ libraries which are fully accessible from the Python programming language. Additionally, the framework provides a sophisticated graphics module that interactively displays molecular structures and density maps in three dimensions. In this work, the latest developments in the OpenStructure framework are outlined. The extensive capabilities of the framework will be illustrated using short code examples that show how information from molecular-structure coordinates can be combined with sequence data and/or density maps. The framework has been released under the LGPL version 3 license and is available for download from http://www.openstructure.org.