Keywords: artificial intelligence; behaviour change interventions; evidence synthesis; information extractions; machine learning; natural language processing; ontologies; prediction systems

Source: DOI:10.12688/wellcomeopenres.20000.1

Abstract:
Background: Using reports of randomised trials of smoking cessation interventions as a test case, this study aimed to develop and evaluate machine learning (ML) algorithms for extracting information from study reports and predicting outcomes as part of the Human Behaviour-Change Project. It is the first of two linked papers, with the second paper reporting on further development of a prediction system.

Methods: Researchers manually annotated 70 items of information ('entities') in 512 reports of randomised trials of smoking cessation interventions covering intervention content and delivery, population, setting, outcome and study methodology using the Behaviour Change Intervention Ontology. These entities were used to train ML algorithms to extract the information automatically. The information extraction ML algorithm involved a named-entity recognition system using the 'FLAIR' framework. The manually annotated intervention, population, setting and study entities were used to develop a deep-learning algorithm using multiple layers of long-short-term-memory (LSTM) components to predict smoking cessation outcomes.

Results: The F1 evaluation score, derived from the false positive and false negative rates (range 0-1), for the information extraction algorithm averaged 0.42 across different types of entity (SD=0.22, range 0.05-0.88) compared with an average human annotator's score of 0.75 (SD=0.15, range 0.38-1.00). The algorithm for assigning entities to study arms (e.g., intervention or control) was not successful. This initial ML outcome prediction algorithm did not outperform prediction based just on the mean outcome value or a linear regression model.

Conclusions: While some success was achieved in using ML to extract information from reports of randomised trials of smoking cessation interventions, we identified major challenges that could be addressed by greater standardisation in the way that studies are reported. Outcome prediction from smoking cessation studies may benefit from development of novel algorithms, e.g., using ontological information to inform ML (as reported in the linked paper 3).
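As a rough illustration of the F1 score reported in the Results, the sketch below computes F1 from true-positive, false-positive and false-negative counts and averages it across entity types. It is not the study's evaluation code: exact-match scoring of (report, value) pairs is an assumption, and the entity type names and values are hypothetical toy data.

```python
from statistics import mean


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined here as 0.0 when all counts are zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def entity_f1(gold: set, predicted: set) -> float:
    """Score one entity type, assuming an extraction counts only on exact match."""
    tp = len(gold & predicted)   # extracted and annotated
    fp = len(predicted - gold)   # extracted but not annotated
    fn = len(gold - predicted)   # annotated but missed
    return f1_score(tp, fp, fn)


def macro_f1(gold_by_type: dict, predicted_by_type: dict) -> float:
    """Unweighted mean of per-type F1 scores (average across entity types)."""
    types = sorted(set(gold_by_type) | set(predicted_by_type))
    return mean(
        entity_f1(gold_by_type.get(t, set()), predicted_by_type.get(t, set()))
        for t in types
    )


# Hypothetical annotations: (report id, extracted value) pairs per entity type.
gold_by_type = {
    "mean_age": {("report_1", "44.2"), ("report_2", "38.0")},
    "country": {("report_1", "UK"), ("report_2", "USA")},
}
predicted_by_type = {
    "mean_age": {("report_1", "44.2"), ("report_2", "83.0")},  # one wrong value
    "country": {("report_1", "UK"), ("report_2", "USA")},
}

print(macro_f1(gold_by_type, predicted_by_type))
```

In this toy case one wrong value counts as both a false positive and a false negative, so a single error lowers the score for that entity type twice over; this is one reason per-type F1 can vary widely between entity types.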