Keywords: European Health Data Space; advanced analytics; artificial intelligence; data-driven healthcare; interoperability; machine learning; privacy-preservation; record linkage

Source: DOI:10.3389/fmed.2024.1301660

Abstract:
The potential for secondary use of health data to improve healthcare is currently not fully exploited. Health data is largely kept in isolated data silos, and key infrastructure to aggregate these silos into standardized bodies of knowledge is underdeveloped. We describe the development, implementation, and evaluation of a federated infrastructure, based on Health Data Space nodes, to facilitate versatile secondary use of health data.
Our proposed nodes are self-contained units that ingest data through an extract-transform-load framework, which pseudonymizes the data, links it using privacy-preserving record linkage, and harmonizes it into a common data model (OMOP CDM). To support collaborative analyses, a multi-level feature store is also implemented. A feasibility experiment was conducted to test the infrastructure's potential for machine learning operations and for deployment of other apps (e.g., visualization). Nodes can be operated in a network at different levels of sharing, according to the level of trust within the network.
In a proof-of-concept study, a privacy-preserving registry for heart failure patients was implemented as a real-world showcase for Health Data Space nodes at the highest trust level, linking multiple data sources including (a) electronic medical records from hospitals, (b) patient data from a telemonitoring system, and (c) data from Austria's national register of deaths. The registry is deployed at the tirol kliniken, a hospital operator in the Austrian state of Tyrol, and currently includes 5,004 patients, with over 2.9 million measurements, over 574,000 observations, more than 63,000 clinical free-text notes, and over 5.2 million data points in total. Data curation and harmonization processes are executed semi-automatically at each individual node according to data sharing policies to ensure data sovereignty, scalability, and privacy. As a feasibility test, a natural language processing model for classification of clinical notes was deployed and tested.
The presented Health Data Space node infrastructure has proven practicable in a real-world implementation: a live, productive registry for heart failure. The present work was inspired by the European Health Data Space initiative and its aim of interconnecting health data silos for versatile secondary use of health data.
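The pseudonymize-and-link step of the extract-transform-load framework can be illustrated with a minimal sketch. The abstract does not say which privacy-preserving record linkage method the nodes use, so the code below swaps in a deliberately simple stand-in: exact-match linkage on a keyed hash (HMAC-SHA256) of normalized identifiers. The key, the field names (`first`, `last`, `dob`), the source names, and the functions `normalize`, `pseudonym`, and `link` are all hypothetical and are not taken from the paper.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret shared by the data sources of one node (assumption).
SECRET_KEY = b"demo-key-not-for-production"

# Identifying fields that must never leave the linkage step (assumed names).
IDENTIFIERS = ("first", "last", "dob")


def normalize(value: str) -> str:
    """Lower-case and collapse whitespace so trivial variants hash alike."""
    return " ".join(value.strip().lower().split())


def pseudonym(first: str, last: str, dob: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonym from identifiers with a keyed hash."""
    message = "|".join(normalize(p) for p in (first, last, dob)).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def link(sources: dict) -> dict:
    """Group records from several sources under one pseudonym each.

    Only non-identifying fields are kept in the linked output.
    """
    linked = defaultdict(dict)
    for source_name, records in sources.items():
        for rec in records:
            pid = pseudonym(rec["first"], rec["last"], rec["dob"])
            payload = {k: v for k, v in rec.items() if k not in IDENTIFIERS}
            linked[pid].setdefault(source_name, []).append(payload)
    return dict(linked)


if __name__ == "__main__":
    demo = {
        "ehr": [{"first": "Anna", "last": "Muster", "dob": "1950-01-02", "lvef": 35}],
        "telemonitoring": [
            {"first": " anna ", "last": "MUSTER", "dob": "1950-01-02", "weight_kg": 71.5}
        ],
    }
    for pid, per_source in link(demo).items():
        print(pid[:12], per_source)
```

Exact-match hashing fails as soon as an identifier contains a typo, which is why production linkage schemes usually rely on error-tolerant encodings; the sketch only shows how records can be joined across sources without the linked output carrying names or birth dates.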