protein–DNA binding

  • Article type: Journal Article
    No abstract available.

  • Article type: Journal Article
    Protein-DNA and protein-RNA interactions are involved in many biological processes and regulate many cellular functions. Moreover, they are related to many human diseases. To understand the molecular mechanisms of protein-DNA and protein-RNA binding, it is important to identify which residues in the protein sequence bind to DNA and RNA. At present, there are few methods for specifically identifying disease-related protein-DNA and protein-RNA binding sites. In this study, we therefore combined four machine learning algorithms into an ensemble classifier (EPDRNA) to predict DNA- and RNA-binding sites in disease-related proteins. The dataset used in the model was collated from the UniProt and PDB databases, and PSSM, physicochemical properties, and amino acid type were used as features. EPDRNA adopted soft voting and achieved a best AUC value of 0.73 for DNA-binding sites and a best AUC value of 0.71 for RNA-binding sites in 10-fold cross-validation on the training sets. To further verify the performance of the model, we assessed EPDRNA for the prediction of DNA-binding sites and RNA-binding sites on the independent test dataset. EPDRNA achieved 85% recall and 25% precision on the protein-DNA interaction independent test set, and 82% recall and 27% precision on the protein-RNA interaction independent test set. The online EPDRNA webserver is freely available at http://www.s-bioinformatics.cn/epdrna .
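
The soft-voting step and the recall/precision figures in this abstract can be illustrated with a minimal sketch. It is not the EPDRNA implementation: the function names, the four probability rows, and the 0.5 threshold are all hypothetical, and the abstract does not say which four algorithms were combined. Soft voting is taken here in its usual sense, averaging each classifier's predicted binding probability per residue.

```python
from typing import List, Sequence, Tuple


def soft_vote(prob_rows: Sequence[Sequence[float]]) -> List[float]:
    """Average per-residue binding probabilities across classifiers.

    Each row holds one (hypothetical) classifier's probabilities for the
    same ordered list of residues.
    """
    if not prob_rows:
        raise ValueError("at least one classifier is required")
    n_models = len(prob_rows)
    n_residues = len(prob_rows[0])
    if any(len(row) != n_residues for row in prob_rows):
        raise ValueError("every classifier must score the same residues")
    return [sum(row[i] for row in prob_rows) / n_models
            for i in range(n_residues)]


def call_sites(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Label a residue as binding (1) when its averaged probability
    reaches the threshold; 0.5 is an assumed default, not a published value."""
    return [1 if p >= threshold else 0 for p in probs]


def precision_recall(y_true: Sequence[int],
                     y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary residue labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Four made-up classifiers scoring the same six residues.
    rows = [
        [0.9, 0.8, 0.6, 0.7, 0.6, 0.1],
        [0.7, 0.9, 0.5, 0.6, 0.7, 0.2],
        [0.8, 0.7, 0.7, 0.5, 0.5, 0.1],
        [0.6, 0.8, 0.6, 0.6, 0.6, 0.2],
    ]
    labels = call_sites(soft_vote(rows))
    truth = [1, 1, 0, 0, 0, 0]  # made-up ground truth
    print(precision_recall(truth, labels))
```

A predictor that labels many residues as binding recovers most true sites (high recall) while a large share of its positive calls are wrong (low precision), which is the pattern the reported 85%/25% and 82%/27% figures describe.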

  • Article type: Journal Article
    Bacterial NAD+-dependent DNA ligases (LigAs) are enzymes involved in replication, recombination, and DNA-repair processes by catalyzing the formation of phosphodiester bonds in the backbone of DNA. These multidomain proteins exhibit four modular domains that are highly conserved across species, with the BRCT (breast cancer type 1 C-terminus) domain at the C-terminus of the enzyme. In this study, we expressed and purified both recombinant full-length and C-terminally truncated LigA from Deinococcus radiodurans (DrLigA and DrLigA∆BRCT) and characterized them using biochemical and X-ray crystallography techniques. Using seeds of DrLigA spherulites, we obtained ≤ 100 µm plate crystals of DrLigA∆BRCT. The crystal structure of the truncated protein was obtained at 3.4 Å resolution, revealing DrLigA∆BRCT in a non-adenylated state. Using molecular beacon-based activity assays, we demonstrated that DNA ligation via nick sealing remains unaffected in the truncated DrLigA∆BRCT. However, DNA-binding assays revealed a reduction in the affinity of DrLigA∆BRCT for dsDNA. Thus, we conclude that the flexible BRCT domain, while not critical for DNA nick-joining, plays a role in the DNA binding process, which may be a conserved function of the BRCT domain in LigA-type DNA ligases.

  • Article type: Journal Article
    In mammals, de novo methylation of cytosines in DNA CpG sites is performed by DNA methyltransferase Dnmt3a. Changes in the methylation status of CpG islands are critical for gene regulation and for the progression of some cancers. Recently, the potential involvement of DNA G-quadruplexes (G4s) in methylation control has been found. Here, we provide evidence for a link between G4 formation and the function of murine DNA methyltransferase Dnmt3a and its individual domains. As DNA models, we used (i) an isolated G4 formed by an oligonucleotide capable of folding into a parallel quadruplex and (ii) the same G4 inserted into a double-stranded DNA bearing several CpG sites. Using electrophoretic mobility shift and fluorescence polarization assays, we showed that the Dnmt3a catalytic domain (Dnmt3a-CD), in contrast to the regulatory PWWP domain, effectively binds the G4 structure formed in both DNA models. The G4-forming oligonucleotide displaced the DNA substrate from its complex with Dnmt3a-CD, resulting in a dramatic suppression of the enzyme activity. In addition, a direct impact of G4 inserted into the DNA duplex on the methylation of a specific CpG site was revealed. Possible mechanisms of G4-mediated epigenetic regulation may include Dnmt3a sequestration at G4 and/or disruption of Dnmt3a oligomerization on the DNA surface.

  • Article type: Journal Article
    The Valley of Sacco River (VSR) (Latium, Italy) is an area with large-scale industrial chemical production that has led over time to significant contamination of soil and groundwater with various industrial pollutants, such as organic pesticides, dioxins, organic solvents, heavy metals, and, in particular, volatile organic compounds (VOCs). In the present study, we investigated the potential impact of VOCs on the spermatozoa of healthy young males living in the VSR, given the prevalence of several VOCs in the semen of these individuals. To accomplish this, spermiograms were conducted, followed by molecular analyses to assess the content of sperm nuclear basic proteins (SNBPs) as well as the protamine-histone ratio and the DNA binding of these proteins. We found drastic alterations in the spermatozoa of these young males living in the VSR. Alterations were seen in sperm morphology, sperm motility, sperm count, and protamine/histone ratios, and included significant reductions in SNBP-DNA binding capacity. Our results provide preliminary indications of a possible correlation between the observed alterations and the presence of specific VOCs.

  • Article type: Journal Article
    In-gel footprinting enables the precise identification of protein binding sites on DNA after separation of free and protein-bound DNA molecules by gel electrophoresis under native conditions and subsequent digestion by the nuclease activity of the 1,10-phenanthroline-copper ion [(OP)2-Cu+] within the gel matrix. Hence, the technique combines the resolving power of protein-DNA complexes in the electrophoretic mobility shift assay (EMSA) with the precision of target site identification by chemical footprinting. This approach is particularly well suited to characterizing distinct molecular assemblies in a mixture of protein-DNA complexes and to identifying individual binding sites within composite operators, when the concentration-dependent occupation of binding sites with different affinities results in the generation of complexes with distinct stoichiometries and migration velocities in gel electrophoresis.

  • Article type: Journal Article
    Predicting in vivo protein-DNA binding sites is a challenging but pressing task in a variety of fields such as drug design and development. Most promoters contain a number of transcription factor (TF) binding sites, but only a small minority have been identified by biochemical experiments, which are time-consuming and laborious. To tackle this challenge, many computational methods have been proposed to predict TF binding sites from DNA sequence. Although previous methods have achieved remarkable performance in the prediction of protein-DNA interactions, there is still considerable room for improvement. In this paper, we present a hybrid deep learning framework, termed DeepD2V, for transcription factor binding site prediction. First, we construct the input matrix from an original DNA sequence and its three kinds of variant sequences: its inverse, complementary, and complementary inverse sequences. A sliding window of size k with a specific stride is used to obtain the k-mer representation of the input sequences. Next, we use word2vec to obtain a pre-trained distributed representation model of the k-mer words. Finally, the probability of protein-DNA binding is predicted using recurrent and convolutional neural networks. Experimental results on 50 public ChIP-seq benchmark datasets demonstrate the superior performance and robustness of DeepD2V. Moreover, we verify that DeepD2V performs better with the word2vec-based k-mer distributed representation than with one-hot encoding, and that the integrated framework of a convolutional neural network (CNN) and a bidirectional LSTM (bi-LSTM) outperforms the CNN or the bi-LSTM model used alone. The source code of DeepD2V is available in the GitHub repository.
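
The first two preprocessing steps described in this abstract (building the three variant sequences, then sliding a window of size k with a given stride) can be sketched as follows. This is an illustration written from the abstract alone, not code from the DeepD2V repository; the function names and the example values of k and stride are hypothetical, and "inverse" is assumed to mean the reversed sequence.

```python
from typing import Dict, List

# Watson-Crick complement table for an uppercase DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sequence_variants(seq: str) -> Dict[str, str]:
    """Return the original sequence plus its inverse (reversed),
    complementary, and complementary inverse forms."""
    seq = seq.upper()
    comp = seq.translate(COMPLEMENT)
    return {
        "original": seq,
        "inverse": seq[::-1],
        "complementary": comp,
        "complementary_inverse": comp[::-1],
    }


def kmers(seq: str, k: int, stride: int = 1) -> List[str]:
    """Slide a window of size k over seq, advancing by stride each step."""
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


if __name__ == "__main__":
    # Each variant becomes a "sentence" of k-mer "words".
    for name, variant in sequence_variants("ACGTTAGC").items():
        print(name, kmers(variant, k=3, stride=1))
```

The resulting lists of k-mer words are the kind of input a word2vec implementation would be trained on; that step and the CNN/bi-LSTM model are not shown here.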

  • Article type: Journal Article
    The sequence-dependent structure and deformability of DNA play a major role in the binding of proteins and the regulation of gene expression. So far, most efforts to model DNA flexibility have been based on unimodal harmonic stiffness models at base-pair resolution. However, multimodal behavior due to distinct conformational substates also contributes significantly to the conformational flexibility of DNA. Moreover, these local substates are correlated with their nearest-neighbor substates. A description of DNA elasticity that includes both multimodality and nearest-neighbor coupling has remained a challenge, which we solve by combining our multivariate harmonic approximation with an Ising model for the substates. In a series of applications to DNA fluctuations and protein-DNA complexes, we demonstrate substantial improvements over the unimodal stiffness model. Furthermore, our multivariate Ising model reveals a mechanical destabilization for adenine (A)-tracts to undergo nucleosome formation. Our approach offers a wide range of applications to determine sequence-dependent deformation energies of DNA and to investigate indirect readout contributions to protein-DNA recognition.
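
The idea of pairing a per-substate harmonic energy with a nearest-neighbor (Ising-like) coupling between substates can be shown with a deliberately simplified toy. It is not the authors' model: it uses one scalar coordinate per base-pair step instead of a multivariate one, and every name and parameter value below is hypothetical.

```python
from typing import Sequence


def chain_energy(coords: Sequence[float],
                 states: Sequence[int],
                 means: Sequence[float],
                 stiffness: Sequence[float],
                 coupling: Sequence[Sequence[float]]) -> float:
    """Toy deformation energy of a chain of base-pair steps.

    coords[i]      -- scalar conformational coordinate of step i
    states[i]      -- discrete substate index of step i
    means[s]       -- equilibrium coordinate of substate s
    stiffness[s]   -- harmonic stiffness of substate s
    coupling[a][b] -- energy cost of neighbouring substates a, b
    """
    if len(coords) != len(states):
        raise ValueError("coords and states must have the same length")
    # Harmonic term: each step is penalised relative to the minimum
    # of the substate it currently occupies (this gives multimodality).
    harmonic = sum(0.5 * stiffness[s] * (x - means[s]) ** 2
                   for x, s in zip(coords, states))
    # Ising-like term: adjacent steps interact through their substates.
    neighbour = sum(coupling[a][b] for a, b in zip(states, states[1:]))
    return harmonic + neighbour


if __name__ == "__main__":
    # Two substates with made-up parameters; mixed neighbours cost 1.0.
    means = [0.0, 2.0]
    stiffness = [2.0, 4.0]
    coupling = [[0.0, 1.0], [1.0, 0.0]]
    print(chain_energy([1.0, 3.0], [0, 1], means, stiffness, coupling))
```

A unimodal stiffness model corresponds to the special case of a single substate, where the coupling term contributes only a constant.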

  • Article type: Journal Article
    Protein hydrogen/deuterium exchange (HDX) coupled to mass spectrometry (MS) can be used to study interactions of proteins with various ligands, to describe the effects of mutations, or to reveal structural responses of proteins to different experimental conditions. It is often described as a method with virtually no limitations in terms of protein size or sample composition. While this is generally true, some ligands or buffer components can significantly complicate the analysis. One such compound that can make HDX-MS troublesome is DNA. In this chapter, we focus on the analysis of protein-DNA interactions, describe the detailed protocol, and point out ways to overcome the complications arising from the presence of DNA.

  • Article type: Journal Article
    DNA mismatch repair (MMR) plays a crucial role in the maintenance of genomic stability. The main MMR protein, MutS, was recently shown to recognize G-quadruplex (G4) DNA structures, which, along with regulatory functions, have a negative impact on genome integrity. Here, we studied the effect of G4 on the DNA-binding activity of MutS from Rhodobacter sphaeroides (methyl-independent MMR) in comparison with MutS from Escherichia coli (methyl-directed MMR) and evaluated the influence of a G4 on the functioning of other proteins involved in the initial steps of MMR. For this purpose, a new DNA construct was designed containing a biologically relevant, stable intramolecular G4 structure flanked by double-stranded regions with the set of DNA sites required for MMR initiation. The secondary structure of this model was examined using NMR spectroscopy, chemical probing, fluorescent indicators, circular dichroism, and UV spectroscopy. The results unambiguously showed that the d(GGGT)4 motif, when embedded in a double-stranded context, adopts a G4 structure of a parallel topology. Despite the strong binding affinities of MutS and MutL for a G4, the G4 is not recognized by E. coli MMR as a signal for repair, but it does not prevent MMR processing when a G4 and a G/T mismatch are in close proximity.