protein–DNA binding







  • Article type: Journal Article
    Protein-DNA and protein-RNA interactions are involved in many biological processes and regulate many cellular functions. Moreover, they are related to many human diseases. To understand the molecular mechanisms of protein-DNA and protein-RNA binding, it is important to identify which residues in the protein sequence bind to DNA and RNA. At present, few methods specifically identify DNA- and RNA-binding sites in disease-related proteins. In this study, we combined four machine learning algorithms into an ensemble classifier (EPDRNA) to predict DNA- and RNA-binding sites in disease-related proteins. The dataset used in the model was collated from the UniProt and PDB databases, and PSSM, physicochemical properties, and amino acid type were used as features. EPDRNA adopted soft voting and achieved its best AUC of 0.73 for DNA-binding sites and 0.71 for RNA-binding sites in 10-fold cross-validation on the training sets. To further verify the performance of the model, we assessed EPDRNA on the prediction of DNA-binding sites and RNA-binding sites using independent test datasets. EPDRNA achieved 85% recall and 25% precision on the protein-DNA interaction independent test set, and 82% recall and 27% precision on the protein-RNA interaction independent test set. The online EPDRNA webserver is freely available at .
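The soft-voting step and the reported precision/recall figures can be illustrated with a minimal sketch. It is not the EPDRNA implementation: the function names `soft_vote` and `precision_recall`, the equal default weights, and the toy probabilities are all assumptions made for illustration. Soft voting here means averaging the per-residue binding probabilities produced by each base classifier.

```python
from typing import Optional, Sequence


def soft_vote(
    prob_per_model: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> list[float]:
    """Weighted average of per-residue binding probabilities across models.

    prob_per_model[m][i] is the probability that model m assigns to
    residue i being a binding site (hypothetical input layout).
    """
    if not prob_per_model:
        raise ValueError("at least one model is required")
    n_residues = len(prob_per_model[0])
    if any(len(p) != n_residues for p in prob_per_model):
        raise ValueError("all models must score the same residues")
    if weights is None:
        # Assumption: equal weights for every base classifier.
        weights = [1.0] * len(prob_per_model)
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, prob_per_model)) / total
        for i in range(n_residues)
    ]


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[float, float]:
    """Precision and recall for binary labels (1 = binding residue)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example: four base classifiers scoring two residues.
averaged = soft_vote([[0.9, 0.2], [0.7, 0.4], [0.8, 0.1], [0.6, 0.3]])
labels = [1 if p >= 0.5 else 0 for p in averaged]
```

Read with these definitions, 85% recall with 25% precision means that most true binding residues are recovered, while roughly one in four residues predicted as binding is a true binding residue.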






  • Article type: Journal Article
    Bacterial NAD+-dependent DNA ligases (LigAs) are enzymes involved in replication, recombination, and DNA-repair processes by catalyzing the formation of phosphodiester bonds in the backbone of DNA. These multidomain proteins exhibit four modular domains that are highly conserved across species, with the BRCT (breast cancer type 1 C-terminus) domain at the C-terminus of the enzyme. In this study, we expressed and purified both recombinant full-length and C-terminally truncated LigA from Deinococcus radiodurans (DrLigA and DrLigA∆BRCT) and characterized them using biochemical and X-ray crystallography techniques. Using seeds of DrLigA spherulites, we obtained ≤ 100 µm plate crystals of DrLigA∆BRCT. The crystal structure of the truncated protein was obtained at 3.4 Å resolution, revealing DrLigA∆BRCT in a non-adenylated state. Using molecular beacon-based activity assays, we demonstrated that DNA ligation via nick sealing remains unaffected in the truncated DrLigA∆BRCT. However, DNA-binding assays revealed a reduction in the affinity of DrLigA∆BRCT for dsDNA. Thus, we conclude that the flexible BRCT domain, while not critical for DNA nick-joining, plays a role in the DNA binding process, which may be a conserved function of the BRCT domain in LigA-type DNA ligases.






  • Article type: Journal Article
    In mammals, de novo methylation of cytosines in DNA CpG sites is performed by DNA methyltransferase Dnmt3a. Changes in the methylation status of CpG islands are critical for gene regulation and for the progression of some cancers. Recently, the potential involvement of DNA G-quadruplexes (G4s) in methylation control has been found. Here, we provide evidence for a link between G4 formation and the function of murine DNA methyltransferase Dnmt3a and its individual domains. As DNA models, we used (i) an isolated G4 formed by an oligonucleotide capable of folding into a parallel quadruplex and (ii) the same G4 inserted into a double-stranded DNA bearing several CpG sites. Using electrophoretic mobility shift and fluorescence polarization assays, we showed that the Dnmt3a catalytic domain (Dnmt3a-CD), in contrast to the regulatory PWWP domain, effectively binds the G4 structure formed in both DNA models. The G4-forming oligonucleotide displaced the DNA substrate from its complex with Dnmt3a-CD, resulting in a dramatic suppression of the enzyme activity. In addition, a direct impact of the G4 inserted into the DNA duplex on the methylation of a specific CpG site was revealed. Possible mechanisms of G4-mediated epigenetic regulation may include Dnmt3a sequestration at G4 and/or disruption of Dnmt3a oligomerization on the DNA surface.






  • Article type: Journal Article
    The Valley of Sacco River (VSR) (Latium, Italy) is an area with large-scale industrial chemical production that has led over time to significant contamination of soil and groundwater with various industrial pollutants, such as organic pesticides, dioxins, organic solvents, heavy metals, and particularly, volatile organic compounds (VOCs). In the present study, we investigated the potential impact of VOCs on the spermatozoa of healthy young males living in the VSR, given the prevalent presence of several VOCs in the semen of these individuals. To accomplish this, spermiograms were conducted followed by molecular analyses to assess the content of sperm nuclear basic proteins (SNBPs) in addition to the protamine-histone ratio and DNA binding of these proteins. We found drastic alterations in the spermatozoa of these young males living in the VSR. Alterations were seen in sperm morphology, sperm motility, sperm count, and protamine/histone ratios, and included significant reductions in SNBP-DNA binding capacity. Our results provide preliminary indications of a possible correlation between the observed alterations and the presence of specific VOCs.






  • Article type: Journal Article
    In-gel footprinting enables the precise identification of protein binding sites on the DNA after separation of free and protein-bound DNA molecules by gel electrophoresis in native conditions and subsequent digestion by the nuclease activity of the 1,10-phenanthroline-copper ion [(OP)2-Cu+] within the gel matrix. Hence, the technique combines the resolving power of protein-DNA complexes in the electrophoretic mobility shift assay (EMSA) with the precision of target site identification by chemical footprinting. This approach is particularly well suited to characterize distinct molecular assemblies in a mixture of protein-DNA complexes and to identify individual binding sites within composite operators, when the concentration-dependent occupation of binding sites, with a different affinity, results in the generation of complexes with a distinct stoichiometry and migration velocity in gel electrophoresis.






  • Article type: Journal Article
    Predicting in vivo protein-DNA binding sites is a challenging but pressing task in a variety of fields such as drug design and development. Most promoters contain a number of transcription factor (TF) binding sites, but only a small minority have been identified by biochemical experiments, which are time-consuming and laborious. To tackle this challenge, many computational methods have been proposed to predict TF binding sites from DNA sequence. Although previous methods have achieved remarkable performance in the prediction of protein-DNA interactions, there is still considerable room for improvement. In this paper, we present a hybrid deep learning framework, termed DeepD2V, for transcription factor binding site prediction. First, we construct the input matrix from an original DNA sequence and its three variant sequences: its inverse, complementary, and complementary inverse sequences. A sliding window of size k with a specific stride is used to obtain the k-mer representation of the input sequences. Next, we use word2vec to obtain a pre-trained distributed representation model of the k-mer words. Finally, the probability of protein-DNA binding is predicted using a recurrent and convolutional neural network. Experimental results on 50 public ChIP-seq benchmark datasets demonstrate the superior performance and robustness of DeepD2V. Moreover, we verify that DeepD2V performs better with the word2vec-based k-mer distributed representation than with one-hot encoding, and that the integrated framework combining a convolutional neural network (CNN) and a bidirectional LSTM (bi-LSTM) outperforms the CNN or the bi-LSTM model used alone. The source code of DeepD2V is available at the GitHub repository.
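The first two preprocessing steps described above (building the four sequence variants and cutting them into k-mers with a sliding window) can be sketched as follows. This is an illustrative sketch, not the DeepD2V source code: the function names `sequence_variants` and `kmer_tokens`, the restriction to the A/C/G/T alphabet, and the example values of k and stride are assumptions. The word2vec embedding and the CNN/bi-LSTM stages are not shown.

```python
# Complement map for the four canonical bases (assumption: no ambiguity codes such as N).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sequence_variants(seq: str) -> list[str]:
    """Return [original, inverse, complementary, complementary inverse]."""
    seq = seq.upper()
    complement = seq.translate(_COMPLEMENT)
    return [seq, seq[::-1], complement, complement[::-1]]


def kmer_tokens(seq: str, k: int, stride: int = 1) -> list[str]:
    """Slide a window of size k over seq with the given stride."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


# Toy example: each variant becomes a "sentence" of k-mer "words".
variants = sequence_variants("AACGT")
sentences = [kmer_tokens(v, k=3, stride=1) for v in variants]
```

Each list in `sentences` could then be fed to a word-embedding model as a tokenized sentence, which is the role the abstract assigns to word2vec.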







  • Article type: Journal Article
    The sequence-dependent structure and deformability of DNA play a major role in protein binding and the regulation of gene expression. So far, most efforts to model DNA flexibility have been based on unimodal harmonic stiffness models at base-pair resolution. However, multimodal behavior due to distinct conformational substates also contributes significantly to the conformational flexibility of DNA. Moreover, these local substates are correlated with their nearest-neighbor substates. A description of DNA elasticity that includes both multimodality and nearest-neighbor coupling has remained a challenge, which we solve by combining our multivariate harmonic approximation with an Ising model for the substates. In a series of applications to DNA fluctuations and protein-DNA complexes, we demonstrate substantial improvements over the unimodal stiffness model. Furthermore, our multivariate Ising model reveals a mechanical destabilization for adenine (A)-tracts to undergo nucleosome formation. Our approach offers a wide range of applications to determine sequence-dependent deformation energies of DNA and to investigate indirect readout contributions to protein-DNA recognition.






  • Article type: Journal Article
    Protein hydrogen/deuterium exchange (HDX) coupled to mass spectrometry (MS) can be used to study interactions of proteins with various ligands, to describe the effects of mutations, or to reveal structural responses of proteins to different experimental conditions. It is often described as a method with virtually no limitations in terms of protein size or sample composition. While this is generally true, some ligands or buffer components can significantly complicate the analysis. One such compound that can make HDX-MS troublesome is DNA. In this chapter, we focus on the analysis of protein-DNA interactions, describe the detailed protocol, and point out ways to overcome the complications arising from the presence of DNA.






  • Article type: Journal Article
    DNA mismatch repair (MMR) plays a crucial role in the maintenance of genomic stability. The main MMR protein, MutS, was recently shown to recognize the G-quadruplex (G4) DNA structures, which, along with regulatory functions, have a negative impact on genome integrity. Here, we studied the effect of G4 on the DNA-binding activity of MutS from Rhodobacter sphaeroides (methyl-independent MMR) in comparison with MutS from Escherichia coli (methyl-directed MMR) and evaluated the influence of a G4 on the functioning of other proteins involved in the initial steps of MMR. For this purpose, a new DNA construct was designed containing a biologically relevant intramolecular stable G4 structure flanked by double-stranded regions with the set of DNA sites required for MMR initiation. The secondary structure of this model was examined using NMR spectroscopy, chemical probing, fluorescent indicators, circular dichroism, and UV spectroscopy. The results unambiguously showed that the d(GGGT)4 motif, when embedded in a double-stranded context, adopts a G4 structure of a parallel topology. Despite strong binding affinities of MutS and MutL for a G4, the latter is not recognized by E. coli MMR as a signal for repair, but does not prevent MMR processing when a G4 and G/T mismatch are in close proximity.






