Keywords: ClinVar; FoldX; GnomAD; Short linear motifs (SLiMs); Single amino acid substitution (SAS); Single nucleotide variants (SNV)

MeSH: Amino Acid Motifs; Amino Acid Sequence; Computational Biology / methods; Genomics; Nucleotides

Source: DOI:10.1016/j.biochi.2022.02.002

Abstract:
Short linear motifs (SLiMs) are key to cell physiology, mediating reversible protein-protein interactions. Precise identification of SLiMs remains a challenge: the main drawback of most bioinformatic prediction tools is their low specificity (a high number of false positives). An important but usually overlooked aspect is the relation between SLiM mutations and disease. The presence of variants at each residue position can be used to assess the relevance of the corresponding residue(s) for protein function and their (in)tolerance to change. In the present work, we combined sequence variant information with structural analysis of the energetic impact of single amino acid substitutions (SAS) in the SLiM-receptor complex structure, and showed that this improves prediction of true functional SLiMs. Our strategy is based on building a SAS tolerance matrix that shows, for each position, whether each of the 19 possible SAS is tolerated or not. Herein we present the MotSASi strategy and analyze in detail 3 SLiMs involved in intracellular protein trafficking: the phospho-independent tyrosine-based motif (NPx[Y/F]), the type 1 PDZ-binding motif ([S/T]x[V/I/L]COOH) and the tryptophan-acidic motif ([L/M]xW[D/E]). Our results show that including variant and structure information improves both prediction of true SLiMs and rejection of false positives, while also allowing better classification of variants inside SLiMs, a result with a direct impact on clinical genomics.
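
To make the idea of a per-position tolerance matrix concrete, the following is a minimal sketch in Python. It is not the MotSASi procedure: according to the abstract, MotSASi derives tolerance from sequence variant data and from structure-based energy calculations, neither of which is reproduced here. As a naive stand-in, a substitution is marked as tolerated only if the mutated motif still matches its pattern. The regular expressions are our own translation of the three motif patterns, and the function names `find_motifs` and `tolerance_matrix` and the example sequence are hypothetical.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed regex translations of the three motif patterns named in the abstract.
MOTIFS = {
    "NPx[Y/F]": re.compile(r"NP.[YF]"),
    # "COOH" means the motif must sit at the C-terminus, hence the end anchor.
    "[S/T]x[V/I/L]COOH": re.compile(r"[ST].[VIL]\Z"),
    "[L/M]xW[D/E]": re.compile(r"[LM].W[DE]"),
}


def find_motifs(sequence, pattern):
    """Return (start, matched_text) for every match, including overlapping ones."""
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]


def tolerance_matrix(motif_instance, pattern):
    """Build one row per motif position, mapping each of the 19 substitutions
    to True (tolerated) or False (not tolerated).

    Naive stand-in criterion: a substitution is tolerated if the mutated
    motif still matches the pattern. A real matrix would use variant and
    structural evidence instead.
    """
    matrix = []
    for i, wild_type in enumerate(motif_instance):
        row = {}
        for aa in AMINO_ACIDS:
            if aa == wild_type:
                continue  # only the 19 substitutions, not the wild type
            mutant = motif_instance[:i] + aa + motif_instance[i + 1:]
            row[aa] = pattern.fullmatch(mutant) is not None
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    sequence = "MKNPAYLLSWDQTSV"  # made-up sequence for illustration only
    for name, pattern in MOTIFS.items():
        for start, hit in find_motifs(sequence, pattern):
            matrix = tolerance_matrix(hit, pattern)
            for offset, row in enumerate(matrix):
                tolerated = "".join(aa for aa, ok in row.items() if ok)
                print(name, start + offset, hit[offset], tolerated or "-")
```

Under this pattern-only criterion, the wildcard position `x` tolerates all 19 substitutions and a fixed position such as the `N` of NPx[Y/F] tolerates none. The point of combining variant and structure information, as the abstract describes, is to replace this crude rule with evidence-based entries.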