Keywords: CDK8; fragment; machine learning; molecular docking; structure-based virtual screening

MeSH: Humans; Cyclin-Dependent Kinase 8 / antagonists & inhibitors, chemistry, metabolism; Drug Evaluation, Preclinical / methods; Machine Learning; Molecular Docking Simulation; Protein Kinase Inhibitors / chemistry, pharmacology; Small Molecule Libraries / chemistry, pharmacology

Source: DOI:10.1002/pro.5007

Abstract:
The identification of an effective inhibitor is an important starting step in drug development. Unfortunately, many issues, such as the characterization of protein binding sites, the screening library, and materials for assays, make drug screening difficult. As the size of screening libraries increases, more resources are consumed inefficiently. Thus, new strategies are needed to preprocess a screening library and focus it toward a target protein. Herein, we report an ensemble machine learning (ML) model to generate a CDK8-focused screening library. The ensemble model consists of six different algorithms optimized for CDK8 inhibitor classification. The models were trained using a CDK8-specific fragment library along with molecules with CDK8 activity. The optimized ensemble model processed a commercial library containing 1.6 million molecules. This resulted in a CDK8-focused screening library containing 1,672 molecules, a reduction of more than 99.90%. The CDK8-focused library was then subjected to molecular docking, and 25 candidate compounds were selected. Enzymatic assays confirmed six CDK8 inhibitors, with one compound producing an IC50 value of ≤100 nM. Analysis of the ensemble ML model reveals the role of the CDK8 fragment library during training. Structural analysis reveals the hit compounds to be structurally novel CDK8 inhibitors. Together, the results highlight a pipeline for curating a focused library for a specific protein target, such as CDK8.
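
The ensemble filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the six algorithms or say how their outputs are combined, so the soft-voting rule (mean of per-model probabilities), the 0.5 threshold, the function name `focus_library`, and all molecule IDs and scores below are assumptions made purely for illustration, not the authors' method.

```python
from statistics import mean
from typing import Dict, List


def focus_library(
    predictions: Dict[str, List[float]], threshold: float = 0.5
) -> List[str]:
    """Keep molecules whose mean predicted inhibitor probability
    across all models meets the threshold (soft voting).

    `predictions` maps a molecule ID to one probability per model.
    The combination rule and threshold are illustrative assumptions.
    """
    kept = []
    for mol_id, probs in predictions.items():
        if mean(probs) >= threshold:
            kept.append(mol_id)
    return kept


# Hypothetical scores from six classifiers for three molecules.
predictions = {
    "mol_A": [0.9, 0.8, 0.7, 0.95, 0.6, 0.85],
    "mol_B": [0.2, 0.4, 0.1, 0.3, 0.55, 0.25],
    "mol_C": [0.6, 0.45, 0.7, 0.5, 0.65, 0.4],
}

focused = focus_library(predictions)
print(focused)
```

In a workflow of this kind, only the molecules that pass such a consensus filter would be forwarded to the more expensive molecular docking step; a stricter threshold trades recall for a smaller focused library.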