motif

Topic
  • Article type: Journal Article
    BACKGROUND: Decoding human genomic sequences requires comprehensive analysis of DNA sequence functionality. Through computational and experimental approaches, researchers have studied the genotype-phenotype relationship and generated important datasets that help unravel complicated genetic blueprints. Thus, recently developed artificial intelligence methods can be used to interpret the functions of those DNA sequences.
    METHODS: This study explores the use of deep learning, particularly pre-trained genomic models such as DNA_bert_6 and human_gpt2-v1, in interpreting and representing human genome sequences. We first constructed multiple datasets linking genotypes and phenotypes to fine-tune those models for precise DNA sequence classification. We then evaluated the influence of sequence length on classification results and analyzed the impact of feature extraction in the hidden layers of the model using the HERV dataset. To better understand the phenotype-specific patterns recognized by the model, we performed enrichment, pathogenicity, and conservation analyses of specific motifs in human endogenous retrovirus (HERV) sequences with high average local representation weight (ALRW) scores.
    RESULTS: We constructed multiple genotype-phenotype datasets that displayed commendable classification performance in comparison with random genomic sequences, particularly the HERV dataset, which achieved binary and multi-classification accuracies and F1 values exceeding 0.935 and 0.888, respectively. Notably, fine-tuning on the HERV dataset not only improved the ability to identify and distinguish diverse information types within DNA sequences but also identified specific motifs associated with neurological disorders and cancers in regions with high ALRW scores. Subsequent analysis of these motifs shed light on the adaptive responses of species to environmental pressures and their co-evolution with pathogens.
    CONCLUSIONS: These findings highlight the potential of pre-trained genomic models in learning DNA sequence representations, particularly when utilizing the HERV dataset, and provide valuable insights for future research. This study represents an innovative strategy that combines pre-trained genomic model representations with classical methods for analyzing the functionality of genome sequences, thereby promoting cross-fertilization between genomics and artificial intelligence.

  • Article type: Journal Article
    Computational studies were performed to investigate the unknown status of endosomal and cell surface receptors in SARS-CoV-2 infection. The interactions between Toll-like receptors (TLRs) 4/7/8/9 or the ACE2 receptor and different SARS-CoV-2 variants were investigated.
    The RNA motifs for TLR7, TLR8 and a CpG motif for TLR9 were analyzed in different variants. Molecular docking and molecular dynamics (MD) simulations were performed to investigate receptor-ligand interactions.
    The number of motifs recognized by TLR7/8/9 in the Alpha, Delta and Iranian variants was lower than in the wild type (WT). Docking analysis revealed that the Alpha, Delta and some Iranian spike variants had a higher affinity for ACE2 and TLR4 than the WT, which may account for their higher transmission rate. The MD simulation also showed differences in stability and structure size between the variants and the WT, indicating potential variations in viral load.
    It appears that Alpha and some Iranian isolates are the variants of concern due to their higher transmissibility and rapid spread. The Delta mutant is also a variant of concern, not only because of its closer interaction with ACE2, but also with TLR4. Our results emphasize the importance of ACE2 and TLR4, rather than endosomal TLRs, in mediating the effects of different viral mutations and suggest their potential therapeutic applications.
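
The motif counting described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact motif definitions used for TLR7/8/9, so the motif strings below are placeholders (a plain "CG" dinucleotide stands in for a CpG motif), and the function names `count_motif` and `motifs_per_kb` are hypothetical.

```python
def count_motif(seq: str, motif: str) -> int:
    """Count occurrences of `motif` in `seq`, allowing overlaps, case-insensitively."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    # Slide a window of length k over the sequence and compare each window.
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def motifs_per_kb(seq: str, motif: str) -> float:
    """Normalize the motif count by sequence length (occurrences per 1000 bases)."""
    if not seq:
        return 0.0
    return count_motif(seq, motif) * 1000.0 / len(seq)


if __name__ == "__main__":
    # Toy sequences only; these are not real viral genomes.
    wild_type = "ACGCGTTACGA"
    variant = "ACACGTTATGA"
    print(count_motif(wild_type, "CG"), count_motif(variant, "CG"))
```

Normalizing by length, as in `motifs_per_kb`, is one way to compare genomes of slightly different sizes; whether the authors did so is not stated.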

  • Article type: Journal Article
    The FK506 Binding Protein (FKBP), ubiquitously present across diverse species, is characterized by its evolutionarily conserved FK506 binding domain (FKBd). In plants, evidence suggests that this gene family plays integral roles in regulating growth, development, and responses to environmental stresses. Notably, research on the identification and functionality of FKBP genes in rice remains limited. Therefore, this study utilized bioinformatic tools to identify 30 FKBP-encoding genes in rice. It provides a detailed analysis of their chromosomal locations, evolutionary relationships with the Arabidopsis thaliana FKBP family, and gene structures. Further analysis of the promoter elements of these rice FKBP genes revealed a high presence of stress-responsive elements. Quantitative PCR assays under drought and heat stress conditions demonstrated that genes OsFKBP15-2, OsFKBP15-3, OsFKBP16-3, OsFKBP18, and OsFKBP42b are inducible by these adverse conditions. These findings suggest a significant role for the rice FKBP gene family in stress adaptation. This research establishes a critical foundation for deeper explorations of the functional roles of the OsFKBP genes in rice.

  • Article type: Journal Article
    Bos taurus is known for its adaptability, disease resistance, and tolerance of coarse grains, high temperature, and humidity. Cattle are raised primarily for their meat and milk, and pinpointing genes associated with traits relevant to meat production can enhance their overall productivity. The aim of this study was to identify the Pax gene family in B. taurus at the genome level, analyze its evolution, and explore its function, in order to provide a new molecular target for breeding cattle for meat-quality traits. In this study, 44 Pax genes were identified from the genome databases of five species using bioinformatics technology, indicating that the genetic relationships of bovids were similar. The Pax3 and Pax7 protein sequences of the five animals were highly consistent. In general, the Pax genes of buffalo correspond to those of domestic cattle. In summary, there are differences in affinity between the Pax family genes of buffalo and domestic cattle in the Pax1/9, Pax2/5/8, Pax3/7, and Pax4/6 subfamilies. We believe that Pax1/9 has an effect on the growth traits of buffalo and domestic cattle. The Pax3/7 gene is conserved in the evolution of buffalo and domestic animals and may be a key gene regulating the growth of B. taurus. The Pax2/5/8 subfamily affects coat color, reproductive performance, and milk production performance in cattle. The Pax4/6 subfamily had an effect on the milk fat percentage of B. taurus. The results provide a theoretical basis for understanding the evolutionary, structural, and functional characteristics of the Pax family members of B. taurus and for the molecular genetics and breeding of meat-producing B. taurus.

  • Article type: Journal Article
    Magnetoreceptive biology as a field remains relatively obscure; compared with the breadth of species believed to sense magnetic fields, it remains under-studied. Here, we present grounds for the expansion of magnetoreception studies among teleosts. We begin with the electromagnetic perceptive gene (EPG) from Kryptopterus vitreolus and expand to identify 72 teleosts with homologous proteins containing a conserved three-phenylalanine (3F) motif. Phylogenetic analysis provides insight into how EPG may have evolved over time and indicates that certain clades may have experienced a loss of function driven by different fitness pressures. One potential factor is water type, with freshwater fish significantly more likely to possess the functional motif version (FFF) and saltwater fish to have the non-functional variant (FXF). It was also revealed that when the 3F motif from the homologue of Brachyhypopomus gauderio (B.g.) is inserted into EPG, giving EPG(B.g.), the response (as indicated by increased intracellular calcium) is faster. This indicates that EPG has the potential to be engineered to improve its response and increase its utility as a controller for specific outcomes.
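
The FFF versus FXF distinction above can be sketched as a simple pattern check. This is only an illustration under stated assumptions: the abstract does not say where the motif sits in the protein or how homologues were aligned, so the hypothetical function `classify_3f` scans the whole sequence, whereas a real analysis would presumably inspect the aligned motif position.

```python
import re

# FXF: two phenylalanines separated by any single residue that is not F.
FXF_PATTERN = re.compile(r"F[^F]F")


def classify_3f(protein: str) -> str:
    """Return "FFF", "FXF", or "none" for a one-letter amino acid sequence.

    Assumption: the motif may occur anywhere in the sequence; an intact
    FFF takes priority over an interrupted FXF.
    """
    protein = protein.upper()
    if "FFF" in protein:
        return "FFF"
    if FXF_PATTERN.search(protein):
        return "FXF"
    return "none"


if __name__ == "__main__":
    # Toy peptides, not real EPG homologues.
    for peptide in ("MKFFFLA", "MKFAFLA", "MKLLLA"):
        print(peptide, classify_3f(peptide))
```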

  • Article type: Journal Article
    The current investigation endeavors to identify differentially expressed alternatively spliced (DAS) genes that exhibit concordant expression with splicing factors (SFs) under diverse multifactorial abiotic stress combinations in Arabidopsis seedlings. SFs serve as the post-transcriptional mechanism governing the spatiotemporal dynamics of gene expression. The different stresses encompass variations in salt concentration, heat, intensive light, and their combinations. Clusters demonstrating consistent expression profiles were surveyed to pinpoint DAS/SF gene pairs exhibiting concordant expression. Through rigorous selection criteria, which incorporate alignment with documented gene functionalities and expression patterns observed in this study, four members of the serine/arginine-rich (SR) gene family were delineated as SFs concordantly expressed with six DAS genes. These regulated SF genes encompass cactin, SR1-like, SR30, and SC35-like. The identified concordantly expressed DAS genes encode diverse proteins such as the 26.5 kDa heat shock protein, chaperone protein DnaJ, potassium channel GORK, calcium-binding EF hand family protein, DEAD-box RNA helicase, and 1-aminocyclopropane-1-carboxylate synthase 6. Among the concordantly expressed DAS/SF gene pairs, SR30/DEAD-box RNA helicase, and SC35-like/1-aminocyclopropane-1-carboxylate synthase 6 emerge as promising candidates, necessitating further examinations to ascertain whether these SFs orchestrate splicing of the respective DAS genes. This study contributes to a deeper comprehension of the varied responses of the splicing machinery to abiotic stresses. Leveraging these DAS/SF associations shows promise for elucidating avenues for augmenting breeding programs aimed at fortifying cultivated plants against heat and intensive light stresses.

  • Article type: Journal Article
    This work aimed to design a synthetic salt-inducible promoter using a cis-engineering approach. The designed promoter (PS) comprises a minimal promoter sequence for basal-level expression and upstream cis-regulatory elements (CREs) from promoters of salinity-stress-induced genes. The copy number, spacer lengths, and locations of CREs were manually determined based on their occurrence within native promoters. The initial activity profile of the synthesized PS promoter in transiently transformed N. tabacum leaves shows a seven-fold, five-fold, and four-fold increase in reporter GUS activity under salt, drought, and abscisic acid stress, respectively, at the 24-h interval, compared to the constitutive CaMV35S promoter. Analysis of gus expression in stable Arabidopsis transformants showed that the PS promoter induces over a two-fold increase in expression under drought or abscisic acid stress and a five-fold increase under salt stress at 24- and 48-h intervals, compared to the CaMV35S promoter. The promoter PS exhibits higher and more sustained activity under salt, drought, and abscisic acid stress compared to the constitutive CaMV35S.

  • Article type: Journal Article
    The 3D structures of many ribonucleic acid (RNA) loops are characterized by highly organized networks of non-canonical interactions. Multiple computational methods have been developed to annotate structures with those interactions or automatically identify recurrent interaction networks. By contrast, the reverse problem, which aims to retrieve the geometry of a loop from its sequence or ensemble of interactions, remains much less explored. In this chapter, we will describe how to retrieve and build families of conserved structural motifs using their underlying network of non-canonical interactions. Then, we will show how to assign sequence alignments to those families and use the software BayesPairing to build statistical models of structural motifs with their associated sequence alignments. From this model, we will apply BayesPairing to identify, in new sequences, regions where those loop geometries can occur.

  • Article type: Journal Article
    Cellular microRNAs (miRNAs) can be selectively secreted or retained, adding another layer to their critical role in regulating human health and disease. To date, select RNA-binding proteins (RBPs) have been proposed to be a mechanism underlying miRNA localization, but the overall relevance of RBPs in systematic miRNA sorting remains unclear. This study profiles intracellular and small extracellular vesicle (sEV) miRNAs in NPY-expressing hypothalamic neurons. These findings were corroborated by the publicly available sEV and intracellular miRNA profiles of white and brown adipocytes, endothelium, liver, and muscle from various databases. Using experimentally determined binding motifs of 93 RBPs, our enrichment analysis revealed that sEV-originating miRNAs contained significantly different RBP motifs than those of intracellularly retained miRNAs. Multiple RBP motifs were shared across cell types; for instance, RBM4 and SAMD4 are significantly enriched in neurons, hepatocytes, skeletal muscle, and endothelial cells. Homologs of both proteins physically interact with Argonaute1/2 proteins, suggesting that they play a role in miRNA sorting. Machine learning modeling also demonstrates that significantly enriched RBP motifs could predict cell-specific preferential miRNA sorting. Non-optimized machine learning modeling of the motifs using Random Forest and Naive Bayes in all cell types except WAT achieved an area under the receiver operating characteristic (ROC) curve of 0.77-0.84, indicating a high predictive accuracy. Given that the RBP motifs have a significant predictive power, these results underscore the critical role that RBPs play in miRNA sorting within mammalian cells and reinforce the importance of miRNA sequencing in preferential localization. For the future development of small RNA therapeutics, considering these RBP-RNA interactions could be crucial to maximize delivery effectiveness and minimize off-target effects.
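
For readers unfamiliar with the metric reported above, the area under the ROC curve can be computed directly from labels and classifier scores. The sketch below uses the standard pairwise (Mann-Whitney) formulation in plain Python; it is not the authors' pipeline, and the labels and scores in the demo are made up for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Equals the probability that a randomly chosen positive receives a higher
    score than a randomly chosen negative, with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical example: 1 = sEV-sorted miRNA, 0 = intracellularly retained.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.77-0.84 should be read.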

  • Article type: Journal Article
    There is growing interest in inferring context-specific gene regulatory networks from single-cell RNA sequencing (scRNA-seq) data. This involves identifying the regulatory relationships between transcription factors (TFs) and genes in individual cells, and then characterizing these relationships at the level of specific cell types or cell states. In this study, we introduce scGATE (single-cell gene regulatory gate) as a novel computational tool for inferring TF-gene interaction networks and reconstructing Boolean logic gates involving regulatory TFs using scRNA-seq data. In contrast to current Boolean models, scGATE eliminates the need for individual formulations and likelihood calculations for each Boolean rule (e.g. AND, OR, XOR). By employing a Bayesian framework, scGATE infers the Boolean rule after fitting the model to the data, resulting in a significant reduction in time complexity for logic-based studies. We have applied single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq) data and TF DNA binding motifs to filter out non-relevant TFs in gene regulation. By integrating single-cell clustering with these external cues, scGATE is able to infer context-specific networks. The performance of scGATE is evaluated using synthetic and real single-cell multi-omics data from mouse tissues and human blood, demonstrating its superiority over existing tools for reconstructing TF-gene networks. Additionally, scGATE provides a flexible framework for understanding the complex combinatorial and cooperative relationships among TFs regulating target genes by inferring Boolean logic gates among them.
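
To make the notion of a Boolean logic gate between two TFs concrete, the sketch below scores the AND, OR, and XOR rules against binarized expression data. Note that this is the rule-by-rule enumeration that the abstract says scGATE avoids; it is not scGATE's Bayesian method. The function names and the toy on/off vectors are hypothetical.

```python
from operator import and_, or_, xor

# Candidate two-input Boolean rules, applied to 0/1 values.
GATES = {"AND": and_, "OR": or_, "XOR": xor}


def gate_agreement(tf_a, tf_b, target):
    """Fraction of cells in which each gate's output matches the target gene state.

    All inputs are equal-length sequences of 0/1 values, one entry per cell.
    """
    n = len(target)
    if n == 0 or len(tf_a) != n or len(tf_b) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    return {
        name: sum(fn(a, b) == t for a, b, t in zip(tf_a, tf_b, target)) / n
        for name, fn in GATES.items()
    }


def best_gate(tf_a, tf_b, target):
    """Name of the gate with the highest agreement (first in dict order on ties)."""
    scores = gate_agreement(tf_a, tf_b, target)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Four toy cells covering every combination of the two TF states.
    tf_a = [0, 0, 1, 1]
    tf_b = [0, 1, 0, 1]
    target = [0, 1, 1, 0]  # on only when exactly one TF is on
    print(gate_agreement(tf_a, tf_b, target), best_gate(tf_a, tf_b, target))
```

Enumerating rules this way grows quickly with the number of candidate TFs and gate types, which is the cost the abstract says a single fitted model is meant to avoid.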