METHODS: We used large-scale text mining of the medical literature to construct a conceptual network based on the Semantic MEDLINE Database (SemMedDB). SemMedDB is a PubMed-scale repository of semantic triples in "concept-relation-concept" format. Relations between concepts are categorized as Excitatory, Inhibitory, or General.
RESULTS: To facilitate the use of the large-scale triple sets in SemMedDB, we developed a computable biomedical knowledge (CBK) system (https://cbk.bjmu.edu.cn/), a website that enables direct retrieval of related publications and their corresponding triples without writing SQL statements. Three case studies are presented to demonstrate applications of the CBK system.
CONCLUSIONS: The CBK system is openly available and user-friendly. It supports rapidly capturing a set of influencing factors for a phenotype and building candidate directed acyclic graphs (DAGs) between exposure and outcome variables. It could be a valuable tool for reducing the time spent exploring relationships between variables and constructing a DAG. A reliable and standardized DAG could significantly improve the design and interpretation of observational health research.
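As an illustration of how "concept-relation-concept" triples might be turned into a candidate DAG, the following minimal Python sketch builds a directed graph from a handful of triples, checks that it is acyclic, and lists concepts that point to both an exposure and an outcome. It is not the CBK system's implementation: the triples are hypothetical placeholders rather than SemMedDB records, and the function names `build_graph`, `is_acyclic`, and `common_causes` are assumptions introduced only for this example.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation category, object).
# Illustrative placeholders only; NOT retrieved from SemMedDB or the CBK system.
TRIPLES = [
    ("Smoking", "Excitatory", "Lung cancer"),
    ("Smoking", "Excitatory", "Chronic bronchitis"),
    ("Age", "General", "Smoking"),
    ("Age", "Excitatory", "Lung cancer"),
    ("Exercise", "Inhibitory", "Obesity"),
]


def build_graph(triples):
    """Return an adjacency map {concept: set of concepts it points to}."""
    graph = defaultdict(set)
    for subject, _category, obj in triples:
        graph[subject].add(obj)
        graph.setdefault(obj, set())  # make sure sink concepts are nodes too
    return graph


def is_acyclic(graph):
    """Check acyclicity with Kahn's algorithm (repeatedly remove roots)."""
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] += 1
    queue = [node for node, degree in indegree.items() if degree == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for target in graph[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)
    # Every node is removed exactly when the graph has no cycle.
    return visited == len(graph)


def common_causes(graph, exposure, outcome):
    """List concepts with a direct edge to both exposure and outcome."""
    return sorted(
        node
        for node, targets in graph.items()
        if exposure in targets and outcome in targets
    )


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    print("acyclic:", is_acyclic(graph))
    print("candidate common causes:", common_causes(graph, "Smoking", "Lung cancer"))
```

In this sketch the relation category is carried along but not used for the graph structure. A candidate graph assembled from literature triples can contain cycles, so an acyclicity check of this kind is a natural first step before treating it as a DAG.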