OBJECTIVE: Identifying new relations between medical entities, such as drugs, diseases, and side effects, is typically a resource-intensive task involving experimentation and clinical trials. The increased availability of related data and curated knowledge enables a computational approach to this task, notably by training models to predict likely relations. Such models rely on meaningful representations of the medical entities being studied. We propose a generic feature vector representation that leverages co-occurrences of medical terms linked with PubMed citations.
METHODS: We demonstrate the usefulness of the proposed representation by inferring two types of relations: a drug causes a side effect and a drug treats an indication. To predict these relations and assess their effectiveness, we applied two modeling approaches: multi-task modeling using neural networks and single-task modeling based on gradient boosting machines and logistic regression.
RESULTS: These trained models, which predict either side effects or indications, obtained significantly better results than baseline models that use a single direct co-occurrence feature. The results demonstrate the advantage of a comprehensive representation.
DISCUSSION: Selecting the appropriate representation has an immense impact on the predictive performance of machine learning models. Our proposed representation is powerful, as it spans multiple medical domains and can be used to predict a wide range of relation types.
CONCLUSIONS: The discovery of new relations between various medical entities can be translated into meaningful insights, for example, related to drug development or disease understanding. Our representation of medical entities can be used to train models that predict such relations, thus accelerating healthcare-related discoveries.
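As an illustration of the idea of a co-occurrence-based representation, the following minimal Python sketch builds feature vectors from a handful of toy "citations", each given as the set of medical terms linked to it. It is not the authors' implementation: the data, the function names (`cooccurrence_counts`, `entity_vector`, `pair_features`), and the choice of raw counts and simple concatenation are all assumptions made for this example. The first element of the pair vector plays the role of the single direct co-occurrence feature used by the baseline, and the remaining elements stand in for the more comprehensive representation.

```python
from collections import Counter
from itertools import combinations

# Toy stand-ins for citations: each is the set of medical terms linked to it.
# All terms and citations are invented for illustration only.
citations = [
    {"aspirin", "headache", "nausea"},
    {"aspirin", "headache"},
    {"aspirin", "nausea", "ulcer"},
    {"ibuprofen", "headache"},
    {"ibuprofen", "ulcer"},
]


def cooccurrence_counts(citations):
    """Count, for every unordered pair of terms, the citations containing both."""
    counts = Counter()
    for terms in citations:
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
    return counts


def direct_cooccurrence(a, b, counts):
    """Number of citations that mention both terms (0 if none)."""
    return counts[tuple(sorted((a, b)))]


def entity_vector(term, vocabulary, counts):
    """Represent a term by its co-occurrence count with each vocabulary term."""
    return [
        0 if other == term else direct_cooccurrence(term, other, counts)
        for other in vocabulary
    ]


def pair_features(a, b, vocabulary, counts):
    """Hypothetical pair representation: the direct co-occurrence count
    followed by the co-occurrence vectors of both entities."""
    return (
        [direct_cooccurrence(a, b, counts)]
        + entity_vector(a, vocabulary, counts)
        + entity_vector(b, vocabulary, counts)
    )


vocabulary = sorted(set().union(*citations))
counts = cooccurrence_counts(citations)
features = pair_features("aspirin", "nausea", vocabulary, counts)
print(features)
```

Vectors of this kind, one per candidate (drug, side effect) or (drug, indication) pair, could then be passed to any standard classifier, such as logistic regression or gradient boosting, with known relations serving as labels.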