protein quaternary structure

  • Article type: Journal Article
    The quality prediction of quaternary structure models of a protein complex, in the absence of its true structure, is known as the Estimation of Model Accuracy (EMA). EMA is useful for ranking predicted protein complex structures and using them appropriately in biomedical research, such as protein-protein interaction studies, protein design, and drug discovery. With the advent of more accurate protein complex (multimer) prediction tools, such as AlphaFold2-Multimer and ESMFold, the estimation of the accuracy of protein complex structures has attracted increasing attention. Many deep learning methods have been developed to tackle this problem; however, there is a noticeable absence of a comprehensive overview of these methods to facilitate future development. Addressing this gap, we present a review of deep learning EMA methods for protein complex structures developed in the past several years, analyzing their methodologies, data and feature construction. We also provide a prospective summary of some potential new developments for further improving the accuracy of the EMA methods.






  • Article type: Journal Article
    Förster resonance energy transfer (FRET) spectrometry is a method for determining the quaternary structure of protein oligomers from distributions of FRET efficiencies that are drawn from pixels of fluorescence images of cells expressing the proteins of interest. FRET spectrometry protocols currently rely on obtaining spectrally resolved fluorescence data from intensity-based experiments. Another imaging method, fluorescence lifetime imaging microscopy (FLIM), is a widely used alternative to compute FRET efficiencies for each pixel in an image from the reduction of the fluorescence lifetime of the donors caused by FRET. In FLIM studies of oligomers with different proportions of donors and acceptors, the donor lifetimes may be obtained by fitting the temporally resolved fluorescence decay data with a predetermined number of exponential decay curves. However, this requires knowledge of the number and the relative arrangement of the fluorescent proteins in the sample, which is precisely the goal of FRET spectrometry, thus creating a conundrum that has prevented users of FLIM instruments from performing FRET spectrometry. Here, we describe an attempt to implement FRET spectrometry on temporally resolved fluorescence microscopes by using an integration-based method of computing the FRET efficiency from fluorescence decay curves. This method, which we dubbed time-integrated FRET (or tiFRET), was tested on oligomeric fluorescent protein constructs expressed in the cytoplasm of living cells. The present results show that tiFRET is a promising way of implementing FRET spectrometry and suggest potential instrument adjustments for increasing accuracy and resolution in this kind of study.
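The idea of computing a FRET efficiency by integrating decay curves rather than fitting exponentials can be sketched as follows. This is a minimal illustration, not the authors' tiFRET implementation: it relies on the standard lifetime relation E = 1 - τ_DA/τ_D and on the fact that the time integral of a decay normalized to unit initial amplitude equals its amplitude-weighted mean lifetime. The function names are hypothetical, and the sketch assumes ideal decays that start at their peak, with no instrument response function and no background.

```python
import math


def integrated_lifetime(times, intensities):
    """Trapezoidal area under a decay curve, normalized to its first sample.

    Assumption: the first sample is the peak of the decay, so the normalized
    area approximates the amplitude-weighted mean lifetime.
    """
    peak = intensities[0]
    if peak <= 0:
        raise ValueError("decay must start with a positive intensity")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * dt
    return area / peak


def fret_efficiency(times, donor_only, donor_with_acceptor):
    """E = 1 - <tau_DA> / <tau_D>, with both lifetimes obtained by integration."""
    tau_d = integrated_lifetime(times, donor_only)
    tau_da = integrated_lifetime(times, donor_with_acceptor)
    return 1.0 - tau_da / tau_d


# Synthetic mono-exponential decays (illustrative lifetimes in ns, not measured data).
times = [i * 0.01 for i in range(5001)]            # 0 to 50 ns
donor_only = [math.exp(-t / 2.5) for t in times]   # unquenched donor
quenched = [math.exp(-t / 1.5) for t in times]     # donor in the presence of acceptor
efficiency = fret_efficiency(times, donor_only, quenched)
```

Because no number of exponential components is specified, the same function applies unchanged to multi-exponential decays, which is the property that makes an integration-based approach attractive when the oligomer configuration is unknown.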






  • Article type: Journal Article
    The identification of physiologically relevant quaternary structures (QSs) in crystal lattices is challenging. To predict the physiological relevance of a particular QS, QSalign searches for homologous structures in which subunits interact in the same geometry. This approach proved accurate but was limited to structures already present in the Protein Data Bank (PDB). Here, we introduce a web server allowing users to submit homo-oligomeric structures of their choice to the QSalign pipeline. Given a user-uploaded structure, the sequence is extracted and used to search for homologs based on sequence similarity and PFAM domain architecture. If structural conservation is detected between a homolog and the user-uploaded QS, physiological relevance is inferred. To widen the predictions, the web server also generates alternative QSs with PISA and processes them in the same way as the submitted query. The result page also shows representative QSs in the protein family of the query, which is informative if no QS conservation was detected or if the protein appears monomeric. These representative QSs can also serve as a starting point for homology modeling.






  • Article type: Journal Article
    An accurate understanding of biomolecular mechanisms and diseases requires information on protein quaternary structure (QS). A critical challenge in inferring QS information from crystallography data is distinguishing biological interfaces from fortuitous crystal-packing contacts. Here, we employ QS conservation across homologs to infer the biological relevance of hetero-oligomers. We compare the structures and compositions of hetero-oligomers, which allow us to annotate 7,810 complexes as physiologically relevant, 1,060 as likely errors, and 1,432 with comparative information on subunit stoichiometry and composition. Excluding immunoglobulins, these annotations encompass over 51% of hetero-oligomers in the PDB. We curate a dataset of 577 hetero-oligomeric complexes to benchmark these annotations, which reveals an accuracy >94%. When homology information is not available, we compare QS across repositories (PDB, PISA, and EPPIC) to derive confidence estimates. This work provides high-quality annotations along with a large benchmark dataset of hetero-assemblies.






  • Article type: Journal Article
    The structure and the RNA-binding properties of the Lsm protein from Halobacterium salinarum have been determined. A distinctive feature of this protein is the presence of a short L4 loop connecting the β3 and β4 strands. Since bacterial Lsm proteins (also called Hfq proteins) have a short L4 loop and form hexamers, whereas archaeal Lsm proteins (SmAP) have a long L4 loop and form heptamers, it has been suggested that the length of the L4 loop may affect the quaternary structure of Lsm proteins. Moreover, the L4 loop covers the region of SmAP corresponding to one of the RNA-binding sites in Hfq, and thus can affect the RNA-binding properties of the protein. Our results show that the SmAP from H. salinarum forms heptamers and possesses the same RNA-binding properties as homologous proteins with the long L4 loop. Therefore, the length of the L4 loop does not govern the number of monomers in the protein particles and does not affect the RNA-binding properties of Lsm proteins.






  • Article type: Journal Article
    HupZ is an expected heme-degrading enzyme in the heme acquisition and utilization pathway in Group A Streptococcus. The isolated HupZ protein containing a C-terminal V5-His6 tag exhibits weak heme degradation activity. Here, we revisited and characterized the HupZ-V5-His6 protein using biochemical assays, mutagenesis, protein quaternary structure analysis, and UV-vis, EPR, and resonance Raman spectroscopies. The results show that the ferric heme-protein complex did not display an expected ferric EPR signal and that heme binding to HupZ triggered the formation of higher oligomeric states. We found that heme binding to HupZ was an O2-dependent process. The single histidine residue in the HupZ sequence, His111, did not bind to the ferric heme, nor was it involved in the weak heme degradation activity. Our results do not favor the heme oxygenase assignment because of the slow binding of heme and the newly discovered association of the weak heme degradation activity with the His6-tag. Altogether, the data suggest that the protein binds heme by its His6-tag, resulting in a heme-induced higher-order oligomeric structure and heme stacking. This work emphasizes the importance of considering exogenous tags when interpreting experimental observations during the study of heme utilization proteins.







  • Article type: Journal Article
    BACKGROUND: Information on the quaternary structure attributes of proteins is important because it is closely related to their biological functions. With the rapid development of next-generation sequencing technology, we face a challenge: how to automatically identify the quaternary structure attributes of new polypeptide chains from their sequence information (i.e., whether they form a monomer, a hetero-oligomer, or a homo-oligomer).
    OBJECTIVE: Our goal is to find a new way to represent protein sequences and thereby improve the prediction rate for protein quaternary structure.
    METHODS: We developed a prediction system for protein quaternary structural type in which a protein sequence is represented by combining Pfam functional-domain and gene ontology information. Protein features are converted into numerical sequences, and the quaternary structure is then predicted with specific machine learning and verification algorithms.
    RESULTS: Our data set contains 5495 protein samples. Using the proposed method, we classify proteins as monomers, hetero-oligomers, or homo-oligomers with a prediction rate of 74.38%, which is 3.24% higher than that of previous studies. With this new feature extraction method, we can further subdivide the quaternary structure classes of proteins, and the results are correspondingly improved.
    CONCLUSIONS: Compared with previous results, the new prediction system improves the prediction rate. We believe that the feature extraction method in this paper is more practical and can serve as a reference for other protein classification problems.






  • Article type: Journal Article
    A precise knowledge of the quaternary structure of proteins is essential to illuminate both their function and their evolution. The major part of our knowledge on quaternary structure is inferred from X-ray crystallography data, but this inference process is hard and error-prone. The difficulty lies in discriminating fortuitous protein contacts, which make up the lattice of protein crystals, from biological protein contacts that exist in the native cellular environment. Here, we review methods devised to discriminate between both types of contacts and describe resources for downloading protein quaternary structure information and identifying high-confidence quaternary structures. The use of high-confidence datasets of quaternary structures will be critical for the analysis of structural, functional, and evolutionary properties of proteins.






  • Article type: Journal Article
    A protein quaternary structure complex, also known as a multimer, plays an important role in the cell. For example, dimeric transcription factors are involved in gene regulation, whereas trimeric virus-infection-associated glycoproteins are related to the human immunodeficiency virus. Classifying protein quaternary structure complexes will be of great help to proteomics research in the post-genome era, yet classification systems for protein quaternary structures have not been widely developed. Therefore, in this study we designed a two-layer machine learning architecture and developed the classification system PClass. Protein quaternary structure complexes are divided into five categories: monomer, dimer, trimer, tetramer, and other subunit classes. We propose a new model selection method within the framework of the bootstrap method with a support vector machine. Each type of complex is classified based on sequences, entropy, and accessible surface area, thereby generating a plurality of feature modules. Subsequently, the best-performing model is selected as the feature module for each type of complex. At this stage, the best performance reaches a Matthews correlation coefficient (MCC) of 70%. The second layer combines the first-layer modules through integration mechanisms and uses six machine learning methods to improve prediction performance, raising the MCC by over 10%. Finally, we analyzed the performance of our classification system using dimeric transcription factors and trimeric virus-infection-associated glycoproteins. PClass is available via a web interface at
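For readers unfamiliar with the performance measure quoted above, the Matthews correlation coefficient for a single binary decision can be computed as in the sketch below. This is a generic illustration, not code from PClass: the function name is hypothetical, the counts are invented, and treating each oligomeric class one-vs-rest is an assumption, since the abstract does not say how MCC was aggregated over the five classes.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Binary MCC from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention for the
    otherwise undefined 0/0 case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Invented counts for a hypothetical "dimer vs. not dimer" classifier.
example_mcc = matthews_corrcoef(tp=90, tn=80, fp=20, fn=10)
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and unlike plain accuracy it stays informative when the classes are imbalanced.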






  • Article type: Journal Article
    The mapping between biological genotypes and phenotypes is central to the study of biological evolution. Here, we introduce a rich, intuitive and biologically realistic genotype-phenotype (GP) map that serves as a model of self-assembling biological structures, such as protein complexes, and remains computationally and analytically tractable. Our GP map arises naturally from the self-assembly of polyomino structures on a two-dimensional lattice and exhibits a number of properties: redundancy (genotypes vastly outnumber phenotypes), phenotype bias (genotypic redundancy varies greatly between phenotypes), genotype component disconnectivity (phenotypes consist of disconnected mutational networks) and shape space covering (most phenotypes can be reached in a small number of mutations). We also show that the mutational robustness of phenotypes scales very roughly logarithmically with phenotype redundancy and is positively correlated with phenotypic evolvability. Although our GP map describes the assembly of disconnected objects, it shares many properties with other popular GP maps for connected units, such as models for RNA secondary structure or the hydrophobic-polar (HP) lattice model for protein tertiary structure. The remarkable fact that these important properties similarly emerge from such different models suggests the possibility that universal features underlie a much wider class of biologically realistic GP maps.
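Two of the quantities named above, phenotype redundancy and mutational robustness, can be made concrete with a deliberately tiny example. The map below is a hypothetical toy and not the polyomino self-assembly model of the paper: genotypes are binary triples and the phenotype depends only on the first two sites. All function names are illustrative assumptions.

```python
from collections import Counter
from itertools import product


def toy_phenotype(genotype):
    """Hypothetical rule: the phenotype is the AND of the first two sites."""
    return genotype[0] & genotype[1]


def neighbours(genotype):
    """Yield every genotype that differs by a single point mutation."""
    for i in range(len(genotype)):
        yield genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]


def redundancy(genotypes, phenotype_of):
    """Count how many genotypes map to each phenotype."""
    return Counter(phenotype_of(g) for g in genotypes)


def phenotype_robustness(genotypes, phenotype_of):
    """Mean fraction of single mutants that keep the phenotype, per phenotype."""
    totals, counts = Counter(), Counter()
    for g in genotypes:
        p = phenotype_of(g)
        mutants = list(neighbours(g))
        unchanged = sum(phenotype_of(m) == p for m in mutants)
        totals[p] += unchanged / len(mutants)
        counts[p] += 1
    return {p: totals[p] / counts[p] for p in counts}


genotypes = list(product((0, 1), repeat=3))
red = redundancy(genotypes, toy_phenotype)
rob = phenotype_robustness(genotypes, toy_phenotype)
```

Even in this eight-genotype toy, the phenotype with more genotypes is also the more robust one, which is the qualitative direction of the redundancy and robustness relationship described in the abstract; the toy says nothing about the logarithmic scaling itself.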






