Keywords: artificial intelligence models; deep learning models; protein conformation analysis; protein folding estimation; protein structure prediction; root-mean-square deviation

Source: DOI:10.1515/jib-2023-0041

Abstract:
Protein structure determination has advanced with the aid of deep learning models, which enable the prediction of protein folding from protein sequences. Accurate predictions are essential in cases where the protein structure remains undescribed, yet they are particularly hard to obtain for rare, diverse structures and for proteins that require complex sample preparation. Different metrics assess prediction reliability and indicate the strength of a result, and combining different models provides a more comprehensive understanding of protein structure. In a previous study, two proteins named ARM58 and ARM56 were investigated. These proteins contain four domains of unknown function and are present in Leishmania spp. ARM refers to an antimony resistance marker. The study's main objective is to assess the accuracy of the model's predictions, thereby providing insights into the complexities and supporting metrics underlying these findings. The analysis also extends to the comparison of predictions obtained from other species and organisms. Notably, one of these proteins shares an ortholog with Trypanosoma cruzi and Trypanosoma brucei, lending further significance to our analysis. This attempt underscores the importance of evaluating the diverse outputs from deep learning models, facilitating comparisons across different organisms and proteins. This becomes particularly pertinent in cases where no previous structural information is available.