Keywords: Chromosome aberration; Connection index; Machine learning; QSAR; Similarity return search

MeSH: Algorithms; Artificial Intelligence; Chromosome Aberrations; Entropy; Humans; Quantitative Structure-Activity Relationship

Source: DOI:10.1016/j.compbiomed.2022.105573

Abstract:
Chromosome aberration (CA) is a serious form of compound-induced genotoxicity that can lead to carcinogenicity and developmental side effects. In this work, we developed a QSAR model for CA prediction using artificial intelligence methods. The model was built on an enlarged data set of 3208 compounds by optimizing machine learning and deep learning algorithms through iterative hyperparameter tuning, using multiple molecular fingerprint descriptors combined with drug-like molecular properties (MP) screened by the entropy weight method, all on the open-source Python platform. In addition, a molecular similarity return search and a molecular connection index, used as an additional descriptor, were introduced so that highly similar compounds could be differentiated and their CA outcome predicted correctly. The final CA-(Q)SAR model achieved a prediction accuracy of 80.6%. The bias of the final model is about 0.9793. Based on this model, we further analyzed the typical structural features within numerical intervals (MPI) of the molecular properties MW, XlogP, and TPSA that are associated with potential CA or non-CA toxicity at a normalized occurrence probability (NOP) above 70%, which may provide useful clues for designing leads or drug candidates devoid of CA genotoxicity.
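The abstract does not spell out how the entropy weight method was applied to the molecular properties, so the following is only a minimal sketch of the textbook form of the method, not the authors' implementation. The function name `entropy_weights`, the min-max scaling step, and the sample values are all assumptions made for illustration.

```python
import math


def entropy_weights(rows):
    """Textbook entropy weight method: one weight per column.

    `rows` is a list of compounds, each a list of numeric property values.
    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two compounds")
    m = len(rows[0])
    diversity = []
    for j in range(m):
        col = [row[j] for row in rows]
        lo, hi = min(col), max(col)
        span = hi - lo
        # Assumption: min-max scale each property to [0, 1] first.
        # A constant column maps to all ones and ends up with no weight.
        scaled = [(v - lo) / span if span else 1.0 for v in col]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        # Shannon entropy normalized to [0, 1] by ln(n); 0 * ln(0) is taken as 0.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        # Low entropy means the property varies a lot across compounds.
        diversity.append(max(0.0, 1.0 - entropy))
    norm = sum(diversity)
    if norm == 0:
        # Every column is uninformative; fall back to equal weights.
        return [1.0 / m] * m
    return [d / norm for d in diversity]


# Made-up property table: three compounds, three properties.
sample = [
    [1.0, 5.0, 10.0],
    [2.0, 5.0, 10.0],
    [3.0, 5.0, 200.0],
]
weights = entropy_weights(sample)
```

In this sketch a property that is identical across all compounds receives a weight near zero, while a property whose values are concentrated in a few compounds receives a larger weight; a screening step could then keep the properties with the highest weights.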