BACKGROUND: Recognition of enhancer-promoter interactions (EPIs) is crucial for human development. EPIs in the genome play a key role in regulating transcription. However, experimental approaches for classifying EPIs are too expensive in terms of effort, time, and resources. Therefore, a growing number of studies develop computational techniques, particularly deep learning and other machine learning methods, to address this problem. Unfortunately, the majority of current computational methods are based on convolutional neural networks, recurrent neural networks, or a combination of the two, which do not take into consideration contextual details and the long-range interactions between the enhancer and promoter sequences. A new transformer-based model called EPI-Trans is presented in this study to overcome these limitations. The multi-head attention mechanism in the transformer model automatically learns features that represent the long-range interrelationships between enhancer and promoter sequences. Furthermore, a generic model is created with transferability, so that it can be used as a pre-trained model for various cell lines. The parameters of the generic model are then fine-tuned on a particular cell line dataset to improve performance.
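To make the attention idea concrete, the following is a minimal sketch of scaled dot-product cross-attention split into heads, in which every enhancer position attends to every promoter position. It is not the EPI-Trans architecture: the one-hot encoding, the toy sequences, the absence of learned projection weights, and all function names are assumptions made for illustration only.

```python
import math

# Hypothetical encoding: one-hot vectors over the four nucleotides.
NUCLEOTIDES = "ACGT"


def one_hot(sequence):
    """Map a DNA string to a list of 4-dimensional one-hot vectors."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in sequence]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, weights)."""
    scale = math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(key dimension).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[c] for w, v in zip(row, values))
                        for c in range(len(values[0]))])
    return outputs, weights


def multi_head_attention(queries, keys, values, num_heads):
    """Split the feature dimension into heads, attend per head, concatenate.

    Assumption: no learned projections; each head sees a slice of the input.
    """
    dim = len(queries[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    merged = [[] for _ in queries]
    for h in range(num_heads):
        part = slice(h * head_dim, (h + 1) * head_dim)
        head_out, _ = attention([q[part] for q in queries],
                                [k[part] for k in keys],
                                [v[part] for v in values])
        for position, vector in enumerate(head_out):
            merged[position].extend(vector)
    return merged


# Toy sequences (made up for illustration, not taken from any dataset).
enhancer = one_hot("ACGTAC")
promoter = one_hot("TTGACGGA")

# Each enhancer position queries every promoter position.
context = multi_head_attention(enhancer, promoter, promoter, num_heads=2)
_, weights = attention(enhancer, promoter, promoter)
```

Here `weights[i][j]` is the attention that enhancer position `i` places on promoter position `j`. Because every pair of positions is scored directly, the distance between them does not limit the interaction, which is the property the abstract contrasts with convolutional and recurrent models.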
RESULTS: Across six benchmark cell lines, the average AUROC for the specific, generic, and best models is 94.2%, 95%, and 95.7%, while the average AUPR is 80.5%, 66.1%, and 79.6%, respectively.
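As a rough illustration of how such averages could be obtained, the sketch below computes AUROC and an AUPR estimate per cell line and then takes the unweighted mean. It rests on assumptions: AUPR is approximated by average precision (the study may use a different estimator), ties in the ranking are not treated specially there, and the cell line names, labels, and scores are invented toy values rather than results from the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Average precision: mean of precision at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


def macro_average(per_cell_line):
    """Unweighted mean of a metric over cell lines."""
    return sum(per_cell_line.values()) / len(per_cell_line)


# Hypothetical predictions for two made-up cell lines (1 = interacting pair).
predictions = {
    "cell_line_A": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "cell_line_B": ([1, 0, 1, 0, 0], [0.9, 0.2, 0.6, 0.7, 0.1]),
}

auroc_by_line = {name: auroc(y, s) for name, (y, s) in predictions.items()}
aupr_by_line = {name: average_precision(y, s) for name, (y, s) in predictions.items()}

mean_auroc = macro_average(auroc_by_line)
mean_aupr = macro_average(aupr_by_line)
```

The pairwise AUROC loop is quadratic in the number of samples, which is fine for a toy example; for real datasets a rank-based implementation from an established library would be the more practical choice.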
CONCLUSIONS: This study proposes a transformer-based deep learning model for EPI prediction. Comparative results on certain cell lines show that EPI-Trans outperforms other state-of-the-art techniques on the task of recognizing EPIs.