Keywords: Disease ontology; Diseases; Drug targets; Illuminating the druggable genome (IDG); Named entity recognition; Proteins; T_dark; Target development level (TDL); Text mining; Understudied protein

MeSH: Humans; User-Computer Interface; Natural Language Processing; PubMed; Software

Source: DOI:10.7717/peerj.17470

Abstract:
TIN-X (Target Importance and Novelty eXplorer) is an interactive visualization tool for illuminating associations between diseases and potential drug targets and is publicly available at newdrugtargets.org. TIN-X uses natural language processing to identify disease and protein mentions within PubMed content, relying on previously published tools for named entity recognition (NER) of gene/protein and disease names. Target data is obtained from the Target Central Resource Database (TCRD). Two important metrics, novelty and importance, are computed from this data and, when plotted as log(importance) vs. log(novelty), help the user visually explore the novelty of drug targets and their associated importance to diseases. TIN-X Version 3.0 has been significantly improved with an expanded dataset, a modernized architecture including a REST API, and an improved user interface (UI). The dataset has been expanded to include not only PubMed publication titles and abstracts, but also full-text articles when available. This results in approximately 9-fold more target/disease associations compared to previous versions of TIN-X. Additionally, the TIN-X database containing this expanded dataset is now hosted in the cloud via Amazon RDS. Recent enhancements to the UI focus on making it more intuitive for users to find diseases or drug targets of interest while providing a new, sortable table-view mode to accompany the existing plot-view mode. UI improvements also help the user browse the associated PubMed publications to explore and understand the basis of TIN-X's predicted association between a specific disease and a target of interest. While implementing these upgrades, computational resources are balanced between the webserver and the user's web browser to achieve adequate performance while accommodating the expanded dataset. Together, these advances aim to extend the duration that users can benefit from TIN-X while providing both an expanded dataset and new features that researchers can use to better illuminate understudied proteins.
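
The abstract does not spell out how novelty and importance are computed, so the following is only a minimal sketch under stated assumptions. It assumes fractional counting: each paper contributes 1/(number of targets it mentions) to a target's mention count, novelty is the reciprocal of that count, and importance for a target/disease pair sums 1/(targets × diseases) over papers mentioning both. These definitions, the sample records, and the function names `score` and `log_points` are illustrative and are not taken from the TIN-X codebase or API.

```python
import math
from collections import defaultdict

# Hypothetical NER output: the targets and diseases mentioned in each paper.
papers = [
    {"targets": {"GPR85"}, "diseases": {"schizophrenia"}},
    {"targets": {"GPR85", "TP53"}, "diseases": {"schizophrenia", "breast cancer"}},
    {"targets": {"TP53"}, "diseases": {"breast cancer"}},
    {"targets": {"TP53"}, "diseases": {"breast cancer"}},
]


def score(papers):
    """Return (novelty per target, importance per (target, disease) pair).

    Assumed definitions (not confirmed by the abstract):
      novelty(t)       = 1 / sum over papers mentioning t of 1/|targets|
      importance(t, d) = sum over papers mentioning t and d of
                         1/(|targets| * |diseases|)
    """
    target_count = defaultdict(float)
    importance = defaultdict(float)
    for paper in papers:
        n_targets = len(paper["targets"])
        n_diseases = len(paper["diseases"])
        if n_targets == 0:
            continue  # a paper without targets contributes nothing
        for target in paper["targets"]:
            target_count[target] += 1.0 / n_targets
            for disease in paper["diseases"]:
                importance[(target, disease)] += 1.0 / (n_targets * n_diseases)
    novelty = {t: 1.0 / c for t, c in target_count.items()}
    return novelty, dict(importance)


def log_points(novelty, importance):
    """Map each pair to (log10 novelty, log10 importance) for a log-log plot."""
    return {
        (t, d): (math.log10(novelty[t]), math.log10(value))
        for (t, d), value in importance.items()
    }


novelty, importance = score(papers)
points = log_points(novelty, importance)
```

Under these assumptions, a target mentioned in few papers gets a higher novelty than a heavily studied one, and a pair that co-occurs often gets a higher importance, which is the contrast the log(importance) vs. log(novelty) plot is meant to expose.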