Keywords: Caenorhabditis; Drosophila; Ecdysozoa; Eukaryota; Haemonchus; essential genes; machine learning; prediction/prioritisation

MeSH: Animals; Machine Learning; Haemonchus / genetics; Genes, Essential; Caenorhabditis elegans / genetics; Helminth Proteins / genetics, metabolism; Computational Biology / methods; Drosophila melanogaster / genetics

Source: DOI:10.3390/ijms25137015

Abstract:
Over the years, comprehensive explorations of the model organisms Caenorhabditis elegans (elegant worm) and Drosophila melanogaster (vinegar fly) have contributed substantially to our understanding of complex biological processes and pathways in multicellular organisms generally. Extensive functional genomic-phenomic, genomic, transcriptomic, and proteomic data sets have enabled the discovery and characterisation of genes that are crucial for life, called 'essential genes'. Recently, we investigated the feasibility of inferring essential genes from such data sets using advanced bioinformatics and showed that a machine learning (ML)-based workflow could be used to extract or engineer features from DNA, RNA, protein, and/or cellular data/information to underpin the reliable prediction of essential genes both within and between C. elegans and D. melanogaster. As these are two distantly related species within the Ecdysozoa, we proposed that this ML approach would be particularly well suited for species that are within the same phylum or evolutionary clade. In the present study, we cross-predicted essential genes within the phylum Nematoda (evolutionary clade V), between C. elegans and the pathogenic parasitic nematode Haemonchus contortus, and then ranked and prioritised H. contortus proteins encoded by these genes as intervention (e.g., drug) target candidates. Using strong, validated predictors, we inferred essential genes of H. contortus that are involved predominantly in crucial biological processes/pathways, including ribosome biogenesis, translation, RNA binding/processing, and signalling, and that are highly transcribed in the germline, somatic gonad precursors, sex myoblasts, vulva cell precursors, various nerve cells, glia, or hypodermis. The findings indicate that this in silico workflow provides a promising avenue to identify and prioritise panels/groups of drug target candidates in parasitic nematodes for experimental validation in vitro and/or in vivo.
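
As a rough illustration of the cross-species idea described in the abstract (train a predictor on one species, score genes of another, then rank them), the following minimal sketch may help. It is not the authors' workflow: the two features, all numeric values, the gene names, and the choice of a plain logistic-regression model are assumptions made here purely for demonstration.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=2000, lr=0.5):
    """Fit a logistic-regression model with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this training gene.
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def rank_genes(features_by_gene, weights, bias):
    """Score each gene and return (gene, score) pairs, highest score first."""
    scored = [
        (gene, sigmoid(bias + sum(w * v for w, v in zip(weights, x))))
        for gene, x in features_by_gene.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical training data for the source species (made-up values).
# Each row: [relative transcription level, conservation score], both in 0..1.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
train_y = [1, 1, 1, 0, 0, 0]  # 1 = essential, 0 = non-essential

# Hypothetical genes of the target species, described by the same features.
target = {"gene_A": [0.85, 0.9], "gene_B": [0.15, 0.2], "gene_C": [0.5, 0.6]}

weights, bias = train_logistic(train_x, train_y)
ranking = rank_genes(target, weights, bias)

for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

The sketch depends on both species being described by the same feature set, which is why the abstract's point about extracting or engineering comparable features from DNA, RNA, protein, and cellular data matters for cross-species prediction.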