Keywords: CNN; drug resistance; enhancer; epigenetics; histone mark; machine learning

MeSH: Histone Code; Humans; Histones / metabolism, genetics; Chromatin / metabolism, genetics; Epigenesis, Genetic; Neural Networks, Computer; Computational Biology / methods; Gene Expression Regulation

Source: DOI:10.1093/bib/bbae373

Abstract:
Histone modifications, known as histone marks, are pivotal in regulating gene expression within cells. The vast number of possible combinations of histone marks makes it difficult to decode their regulatory mechanisms through biological experiments alone. To overcome this challenge, we have developed a method called CatLearning. It uses a modified convolutional neural network architecture with a specially adapted residual network (ResNet) to quantitatively interpret histone marks and predict gene expression. This architecture integrates long-range histone information up to 500 kb and learns chromatin interaction features without 3D information. Using only one histone mark, CatLearning achieves a high level of accuracy. Furthermore, CatLearning predicts gene expression by simulating changes in histone modifications at enhancers and throughout the genome. These findings help in understanding the architecture of histone marks and in developing diagnostic and therapeutic targets for diseases with epigenetic changes.
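
The general idea in the abstract (a convolutional network with residual connections reading one histone mark over a long window, then an in-silico change to the mark at an enhancer) can be illustrated with a minimal sketch. This is not the CatLearning implementation: the 5 kb bin size, the kernel width, the two-block depth, the random untrained weights, the example peak positions, and all function names are assumptions made only for illustration. The 500 kb window is treated here as a single region starting at an arbitrary coordinate.

```python
import random

WINDOW_BP = 500_000            # long-range context mentioned in the abstract
BIN_BP = 5_000                 # hypothetical bin size, not taken from the paper
N_BINS = WINDOW_BP // BIN_BP   # 100 bins


def bin_signal(peaks, window_start):
    """Sum peak heights of one histone mark into fixed-width bins.

    peaks: iterable of (position_bp, height) pairs; peaks outside the
    window are ignored.
    """
    bins = [0.0] * N_BINS
    for pos, height in peaks:
        idx = (pos - window_start) // BIN_BP
        if 0 <= idx < N_BINS:
            bins[idx] += height
    return bins


def conv1d_same(x, kernel):
    """Zero-padded 1D cross-correlation; assumes an odd-length kernel,
    so the output has the same length as the input."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, kernel_a, kernel_b):
    """y = relu(x + conv(relu(conv(x)))): the skip connection adds the input back."""
    h = relu(conv1d_same(x, kernel_a))
    h = conv1d_same(h, kernel_b)
    return relu([xi + hi for xi, hi in zip(x, h)])


def predict_expression(bins, blocks, out_weight, out_bias):
    """Stack residual blocks, average-pool over bins, apply a linear readout."""
    h = bins
    for kernel_a, kernel_b in blocks:
        h = residual_block(h, kernel_a, kernel_b)
    pooled = sum(h) / len(h)
    return out_weight * pooled + out_bias


# Untrained, randomly initialised weights (illustrative only).
rng = random.Random(0)
blocks = [
    ([rng.uniform(-0.5, 0.5) for _ in range(5)],
     [rng.uniform(-0.5, 0.5) for _ in range(5)])
    for _ in range(2)
]

# Made-up peaks for a single histone mark inside one 500 kb window.
peaks = [(120_000, 3.0), (250_000, 8.0), (251_000, 2.0), (410_000, 4.5)]
bins = bin_signal(peaks, window_start=0)
baseline = predict_expression(bins, blocks, out_weight=1.0, out_bias=0.0)

# In-silico perturbation: erase the mark at a hypothetical enhancer bin.
perturbed = list(bins)
perturbed[120_000 // BIN_BP] = 0.0
delta = predict_expression(perturbed, blocks, 1.0, 0.0) - baseline
print(f"baseline={baseline:.4f} change after perturbation={delta:.4f}")
```

With trained weights, the sign and size of the change after perturbation would indicate how strongly the model ties that bin to the predicted expression; with the random weights used here the numbers carry no biological meaning.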