Keywords: Biocomputational method; Biological constraints; Molecular network; Neural networks; Sequence analysis

Source: DOI:10.1016/j.isci.2024.110371

Abstract:
Ab initio computational reconstruction of protein-protein interaction (PPI) networks will provide invaluable insights into cellular systems, enabling the discovery of novel molecular interactions and elucidating biological mechanisms within and between organisms. Leveraging latest-generation protein language models and recurrent neural networks, we present SENSE-PPI, a sequence-based deep learning model that efficiently reconstructs PPIs ab initio, distinguishing partners among tens of thousands of proteins and identifying specific interactions among functionally similar proteins. SENSE-PPI demonstrates high accuracy, limited training requirements, and versatility in cross-species predictions, even for non-model organisms and human-virus interactions. Its performance decreases for phylogenetically more distant model and non-model organisms, but the decline is very gradual. In this regard, it demonstrates the important role of parameters in protein language models. SENSE-PPI is very fast and can test 10,000 proteins against themselves in a matter of hours, enabling the reconstruction of genome-wide proteomes.
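To give a sense of the scale behind "10,000 proteins against themselves", the following minimal sketch counts the unordered pairs in an all-against-all screen and the scoring rate needed to finish within a given time budget. It is only illustrative arithmetic: the abstract does not say whether self-pairs are counted or how many hours a run takes, so the `include_self` flag and the `hours` value are assumptions, and the function names are hypothetical rather than part of SENSE-PPI.

```python
from math import comb


def all_vs_all_pairs(n_proteins: int, include_self: bool = False) -> int:
    """Count unordered protein pairs in an all-against-all screen."""
    pairs = comb(n_proteins, 2)  # distinct pairs (A, B) with A != B
    if include_self:
        pairs += n_proteins  # optionally add homodimer pairs (A, A)
    return pairs


def required_throughput(n_pairs: int, hours: float) -> float:
    """Pairs per second needed to score n_pairs within the time budget."""
    return n_pairs / (hours * 3600)


n = 10_000
pairs = all_vs_all_pairs(n)
print(pairs)  # 49995000 distinct pairs

# Assumed budget of 5 hours; the source only says "a matter of hours".
print(required_throughput(pairs, hours=5.0))
```

Roughly 50 million pairs must therefore be scored, which means that a run lasting a few hours implies a throughput on the order of thousands of pairs per second.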