Solvent accessibility

  • Article type: Journal Article
    Given the challenges of experimentally determining RNA tertiary structures, probing solvent accessibility has become increasingly important for gaining functional insights. Among the various chemical probes developed, the backbone-cleaving hydroxyl radical is the only one that can provide unbiased detection of all accessible nucleotides. However, the readouts have been based on reverse transcription (RT) stops at the cleavage sites, which are prone to false positives due to PCR amplification bias, early drop-off of reverse transcriptase, and the use of random primers in the RT reaction. Here, we introduce a fixed-primer method called RL-Seq, which performs RtcB ligation (RL) between a fixed 5'-OH-end linker and the unique 3'-P-end fragments from hydroxyl radical cleavage prior to high-throughput sequencing. Applying this method to E. coli ribosomes confirmed its ability to probe solvent accessibility at single-nucleotide resolution with high sensitivity (low required sequencing depth) and accuracy (strong correlation with structure-derived values). Moreover, a near-perfect correlation was found between experiments with and without unique molecular identifiers, indicating negligible PCR bias in RL-Seq. Further improvements to RL-Seq and its potential transcriptome-wide applications are discussed.
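
    The comparison of runs with and without unique molecular identifiers (UMIs) can be illustrated with a minimal sketch. It is not the authors' pipeline: the read format (a 0-based fragment 3'-end position paired with a UMI string), the function names, and the toy reads are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cleavage_counts(reads, length, use_umi=False):
    """Tally fragment 3'-end positions per nucleotide.

    reads: list of (position, umi) pairs; this layout is hypothetical.
    With use_umi=True, reads sharing the same (position, umi) pair are
    treated as PCR duplicates and counted once.
    """
    if use_umi:
        reads = set(reads)
    tally = Counter(position for position, _ in reads)
    return [tally.get(i, 0) for i in range(length)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy reads, invented for illustration only.
reads = [
    (0, "AAC"), (0, "AAC"), (0, "GTT"),
    (2, "CCA"), (2, "CCA"), (2, "TGA"), (2, "ACG"),
    (3, "TTT"),
]
raw = cleavage_counts(reads, length=5)
dedup = cleavage_counts(reads, length=5, use_umi=True)
r = pearson(raw, dedup)
```

    A correlation close to 1 between the raw and deduplicated profiles is the kind of evidence the abstract cites for negligible PCR bias.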

  • Article type: Journal Article
    Characterization of RNA structure and function has mostly focused on 2D (secondary) and 3D (tertiary) structures. Recent advances in experimental and computational techniques for probing or predicting RNA solvent accessibility make this 1D representation of tertiary structure an increasingly attractive feature to explore. Here, we provide a survey of these recent developments, which indicate the emergence of solvent accessibility as a simple 1D property that complements secondary and tertiary structures in investigating the complex structure-function relationships of RNAs.

  • Article type: Journal Article
    Superfine pulverisation (SFP) pretreatment of Lycium barbarum L. leaves was performed to obtain highly crystalline cellulose. Compared with other common pulverisation methods, SFP enhanced cellulosic crystallinity by 18.3 % and 8.4 %, with and without post-acid treatments, respectively. XRD and solid-state NMR analyses showed that SFP facilitated the exposure of amorphous substances (i.e., hemicellulose and lignin) to NaOH and H2O2. Large amounts of silicon (5.5 %) and aluminium (2.1 %) were found to be incorporated into the crystalline regions of SFP-produced cellulose. Further FTIR and thermogravimetric analyses revealed that SFP-produced cellulose contained large amounts of hydroxyl groups, which affect the cellulosic crystallinity and thermal stability. These findings demonstrate the potential of SFP to serve as a green technology for the production of highly crystalline and mineral-rich cellulose.

  • Article type: Journal Article
    Direct prediction of the three-dimensional (3D) structures of proteins from one-dimensional (1D) sequences is a challenging problem. Significant structural characteristics such as solvent accessibility and contact number are essential for deriving restraints in modeling protein folding and protein 3D structure. Thus, accurately predicting these features is a critical step in 3D protein structure building.
    In this study, we present DeepSacon, a computational method that can effectively predict protein solvent accessibility and contact number using a deep neural network built on a stacked autoencoder and a dropout method. The results demonstrate that DeepSacon achieves a significant improvement in prediction quality compared with state-of-the-art methods. On a dataset of 5729 monomeric soluble globular proteins, we obtain 0.70 three-state accuracy for solvent accessibility, and 0.33 15-state accuracy and a Pearson correlation coefficient (PCC) of 0.74 for contact number. On the CASP11 benchmark dataset, DeepSacon achieves 0.68 three-state accuracy for solvent accessibility and a PCC of 0.69 for contact number.
    We have shown that DeepSacon can reliably predict solvent accessibility and contact number with a stacked sparse autoencoder and a dropout approach.
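
    For readers unfamiliar with the three-state accuracy metric quoted above, the following is a minimal sketch of how such a score can be computed from relative solvent accessibility (RSA) values. The abstract does not give the class boundaries, so the cut-offs below (0.09 and 0.36), the state names, and the toy values are assumptions, not DeepSacon's actual settings.

```python
from bisect import bisect_right

# Hypothetical RSA cut-offs (fractions), not taken from the abstract:
# buried < 0.09 <= intermediate < 0.36 <= exposed
THRESHOLDS = (0.09, 0.36)
STATES = ("buried", "intermediate", "exposed")


def to_state(rsa, thresholds=THRESHOLDS):
    """Map a relative solvent accessibility value to one of three states."""
    return STATES[bisect_right(thresholds, rsa)]


def three_state_accuracy(predicted, actual):
    """Fraction of residues whose predicted state matches the actual state."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = sum(to_state(p) == to_state(a) for p, a in zip(predicted, actual))
    return hits / len(actual)


# Toy per-residue RSA values, invented for illustration only.
predicted = [0.05, 0.20, 0.50, 0.40, 0.02]
actual = [0.03, 0.30, 0.45, 0.10, 0.12]
accuracy = three_state_accuracy(predicted, actual)
```

    The same pattern extends to the 15-state contact-number accuracy by supplying more thresholds and state labels.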

  • Article type: Journal Article
    As most RNA structures are elusive to structure determination, obtaining solvent accessible surface areas (ASAs) of nucleotides in an RNA structure is an important first step in characterizing potential functional sites and core structural regions. Here, we developed RNAsnap, the first machine-learning method trained on protein-bound RNA structures for solvent accessibility prediction. Built on sequence profiles from multiple sequence alignment (RNAsnap-prof), the method provided robust prediction in fivefold cross-validation and an independent test (Pearson correlation coefficients, r, between predicted and actual ASA values are 0.66 and 0.63, respectively). Application of the method to 6178 mRNAs revealed a positive correlation with mRNA accessibility to dimethyl sulphate (DMS) measured experimentally in vivo (r = 0.37) but not in vitro (r = 0.07), despite the lack of training on mRNAs and the fact that DMS accessibility is only an approximation to solvent accessibility. We further found a strong association, across coding and noncoding regions, between the predicted solvent accessibility of the mutation site of a single nucleotide variant (SNV) and the frequency of that variant in the population for 2.2 million SNVs obtained in the 1000 Genomes Project. Moreover, mapping solvent accessibility of RNAs to the human genome indicated that introns, the 5' cap of 5' untranslated regions, and the 3' cap of 3' untranslated regions are more solvent accessible, consistent with their respective functional roles. These results support conformational selection as the mechanism for the formation of RNA-protein complexes and highlight the utility of genome-scale characterization of RNA tertiary structures by RNAsnap. The server and its stand-alone downloadable version are available at http://sparks-lab.org.

  • Article type: Journal Article
    BACKGROUND: The prediction of solvent accessibility can provide valuable clues for analyzing protein structure and function, for example in protein 3-dimensional structure prediction and B-cell epitope prediction. To fully decipher the protein-protein interaction process, an initial but crucial step is to calculate the protein solvent accessibility, especially when the tertiary structure of the protein is unknown. Although some effort has been put into protein solvent accessibility prediction, the performance of existing methods is far from satisfactory.
    METHODS: To develop a high-accuracy model, we focus on several aspects that affect prediction performance, including sequence-derived features, a weighted sliding window scheme, and parameter optimization of the machine learning approach. To address these issues, we adopt the following strategies. First, we explore various features that have been observed to be associated with residue solvent accessibility. These discriminative features include protein evolutionary information, predicted protein secondary structure, native disorder, physicochemical propensities and several sequence-based structural descriptors of residues. Second, because adjacent residues in a sliding window contribute differently, a weighted sliding window scheme is proposed to differentiate the contributions of adjacent residues to the central residue. Third, particle swarm optimization (PSO) is employed to search for the globally best parameters of the proposed predictor.
    RESULTS: Evaluated by 3-fold cross-validation, our method achieves a mean absolute error (MAE) of 14.1% and a Pearson correlation coefficient (PCC) of 0.75 on our newly compiled dataset. Compared with state-of-the-art prediction models on the two benchmark datasets, our method demonstrates better performance. Experimental results demonstrate that our PSAP achieves high performance and outperforms many existing predictors. A web server called PSAP has been built and is freely available at http://59.73.198.144:8088/SolventAccessibility/.
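
    The weighted sliding window idea and the MAE metric can be sketched as follows. The abstract does not specify the weighting function, so the geometric decay used here, the `decay` parameter, the zero padding at sequence ends, and all function names are assumptions. In PSAP such parameters are reportedly tuned by PSO, whereas here they are simply fixed by hand.

```python
def window_weights(half_width, decay=0.5):
    """Symmetric weights that fall off geometrically with distance from the centre.

    The geometric form is a hypothetical choice made for illustration.
    """
    return [decay ** abs(offset) for offset in range(-half_width, half_width + 1)]


def weighted_window_features(values, half_width=2, decay=0.5):
    """Build one weighted window per residue from a per-residue feature.

    Positions that fall outside the sequence are padded with 0.0.
    """
    weights = window_weights(half_width, decay)
    offsets = range(-half_width, half_width + 1)
    features = []
    for centre in range(len(values)):
        row = []
        for offset, weight in zip(offsets, weights):
            index = centre + offset
            inside = 0 <= index < len(values)
            row.append(weight * values[index] if inside else 0.0)
        features.append(row)
    return features


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy per-residue feature and toy accessibility values (percent), invented here.
windows = weighted_window_features([1.0, 2.0, 4.0], half_width=1, decay=0.5)
mae = mean_absolute_error([10.0, 20.0], [12.0, 17.0])
```

    Each row of `windows` would then be fed to the predictor in place of an unweighted window, so that neighbours closer to the central residue carry more influence.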

  • Article type: Journal Article
    Models of protein evolution tend to ignore functional constraints, although structural constraints are sometimes incorporated. Here we propose a probabilistic framework for codon substitution that evaluates the joint effects of relative solvent accessibility (RSA), a structural constraint, and gene expression, a functional constraint. First, we explore the relationship between RSA and codon usage at the genomic scale as well as at the individual gene scale. Motivated by these results, we construct our framework by determining how probable an amino acid is, given RSA and gene expression, and then evaluating the relative probability of observing a codon compared to other synonymous codons. We come to the biologically plausible conclusion that both RSA and gene expression are related to amino acid frequencies, but, among synonymous codons, the relative probability of a particular codon is more closely related to gene expression than to RSA. To illustrate the potential applications of our framework, we propose a new codon substitution model. Using this model, we obtain estimates of 2Ns, where N is the effective population size and s is the relative fitness difference of an allele. For a training data set consisting of human proteins with known structures and expression data, 2Ns is estimated separately for synonymous and nonsynonymous substitutions in each protein. We then contrast the patterns of synonymous and nonsynonymous 2Ns estimates across proteins while also taking gene expression levels of the proteins into account. We conclude that our 2Ns estimates are too concentrated around 0, and we discuss potential explanations for this lack of variability.