In the rapidly evolving field of computational biology, accurate prediction of protein secondary structure is crucial for understanding protein function, facilitating drug discovery, and advancing disease diagnostics. In this paper, we propose MFTrans, a deep learning-based multi-feature fusion network designed to improve the precision and efficiency of protein secondary structure prediction (PSSP). The model combines a Multiple Sequence Alignment (MSA) Transformer with a multi-view deep learning architecture to capture both global and local features of protein sequences. Using a multi-feature fusion strategy, MFTrans integrates diverse features derived from protein sequences, including MSA, sequence information, evolutionary information, and hidden state information. The MSA Transformer interleaves row and column attention across the input MSA, while a Transformer encoder and decoder enhance the extracted high-level features. After feature fusion, a hybrid architecture combining a convolutional neural network with a bidirectional Gated Recurrent Unit (BiGRU) network extracts further high-level features. In independent tests, MFTrans shows superior generalization ability, outperforming other state-of-the-art PSSP models by 3% on average on public benchmarks including CASP12, CASP13, CASP14, TEST2016, TEST2018, and CB513. Case studies further highlight its advanced performance in predicting mutation sites. MFTrans contributes significantly to the protein science field, opening new avenues for drug discovery, disease diagnosis, and protein.
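To make the idea of interleaving row and column attention over an MSA concrete, the following is a minimal sketch in plain Python, not the MFTrans implementation. It assumes an MSA embedding laid out as sequences × positions × features, and it simplifies heavily: attention is single-head with no learned projections, and the residual connections, layer normalization, and feed-forward sublayers of a real Transformer block are omitted. All function names and the toy values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention over a list of vectors.

    Assumption: queries, keys, and values are the inputs themselves
    (no learned projections, single head) to keep the sketch short.
    """
    dim = len(vectors[0])
    scale = 1.0 / math.sqrt(dim)
    out = []
    for q in vectors:
        weights = softmax([scale * sum(a * b for a, b in zip(q, k)) for k in vectors])
        out.append([sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)])
    return out

def row_attention(msa):
    """Attend across residue positions within each aligned sequence (row)."""
    return [self_attention(row) for row in msa]

def column_attention(msa):
    """Attend across sequences within each alignment column."""
    n_rows, n_cols = len(msa), len(msa[0])
    out = [[None] * n_cols for _ in range(n_rows)]
    for c in range(n_cols):
        column = [msa[r][c] for r in range(n_rows)]
        for r, vec in enumerate(self_attention(column)):
            out[r][c] = vec
    return out

def axial_block(msa):
    """One interleaved block: row attention followed by column attention."""
    return column_attention(row_attention(msa))

# Toy MSA embedding: 3 sequences x 4 positions x 2 features (made-up values).
toy_msa = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]],
    [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [1.0, 1.0]],
    [[1.0, 1.0], [0.5, 0.5], [0.0, 1.0], [1.0, 0.0]],
]
encoded = axial_block(toy_msa)
print(len(encoded), len(encoded[0]), len(encoded[0][0]))  # shape is preserved: 3 4 2
```

Row attention mixes information along each sequence, while column attention mixes information across aligned sequences at the same position. Alternating the two lets every entry of the alignment influence every other at a lower cost than full attention over all sequence-position pairs.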