data science

  • Article type: Journal Article
    Increasing and substantial reliance on electronic health records (EHRs) and data types (ie, diagnosis, medication, and laboratory data) demands assessment of their data quality as a fundamental approach, especially since there is a need to identify appropriate denominator populations with chronic conditions, such as type 2 diabetes (T2D), using commonly available computable phenotype definitions (ie, phenotypes).
    To bridge this gap, our study aims to assess how issues of EHR data quality and variations and robustness (or lack thereof) in phenotypes may have potential impacts in identifying denominator populations.
    Approximately 208,000 patients with T2D were included in our study, which used retrospective EHR data from the Johns Hopkins Medical Institution (JHMI) during 2017-2019. Our assessment included 4 published phenotypes and 1 definition from a panel of experts at Hopkins. We conducted descriptive analyses of demographics (ie, age, sex, race, and ethnicity), use of health care (inpatient and emergency room visits), and the average Charlson Comorbidity Index score of each phenotype. We then used different methods to induce or simulate data quality issues of completeness, accuracy, and timeliness separately across each phenotype. For induced data incompleteness, our model randomly dropped diagnosis, medication, and laboratory codes independently at increments of 10%; for induced data inaccuracy, our model randomly replaced a diagnosis or medication code with another code of the same data type and induced 2% incremental change from -100% to +10% in laboratory result values; and lastly, for timeliness, data were modeled for induced incremental shift of date records by 30 days to 365 days.
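    The induced-incompleteness step described above can be illustrated with a small sketch. This is not the authors' model: the record layout, the function names, the single ICD-10 code E11.9, and the diagnosis-only phenotype rule are all assumptions chosen for illustration. It randomly drops a growing share of one data type and counts how many patients a phenotype still identifies.

```python
import random

def induce_incompleteness(records, drop_fraction, data_type, seed=0):
    """Return a copy of `records` with a random share of one data type removed.

    Hypothetical helper: only records whose "type" equals `data_type`
    are eligible to be dropped; all other records are kept as they are.
    """
    rng = random.Random(seed)
    targets = [i for i, r in enumerate(records) if r["type"] == data_type]
    n_drop = round(len(targets) * drop_fraction)
    dropped = set(rng.sample(targets, n_drop))
    return [r for i, r in enumerate(records) if i not in dropped]

def diagnosis_only_phenotype(records, t2d_codes=frozenset({"E11.9"})):
    """Hypothetical phenotype: at least one T2D diagnosis code qualifies a patient."""
    return {
        r["patient_id"]
        for r in records
        if r["type"] == "diagnosis" and r["code"] in t2d_codes
    }

# Toy data (assumption): 100 patients, each with one diagnosis and one medication record.
records = [
    {"patient_id": p, "type": "diagnosis", "code": "E11.9"} for p in range(100)
] + [
    {"patient_id": p, "type": "medication", "code": "metformin"} for p in range(100)
]

# Drop diagnosis records in 10% increments and count the patients still identified.
counts = {}
for step in range(0, 11):
    degraded = induce_incompleteness(records, step / 10, "diagnosis", seed=42)
    counts[step * 10] = len(diagnosis_only_phenotype(degraded))
```

    Because each toy patient has exactly one diagnosis record, the identified population shrinks linearly and reaches zero at 100% incompleteness, which mirrors the behavior the abstract reports for a phenotype built only on diagnosis codes.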
    Less than a quarter (n=47,326, 23%) of the population overlapped across all phenotypes using EHRs. The population identified by each phenotype varied across all combinations of data types. Induced incompleteness identified fewer patients with each increment; for example, at 100% diagnostic incompleteness, the Chronic Conditions Data Warehouse phenotype identified zero patients, as its phenotypic characteristics included only diagnosis codes. Induced inaccuracy and timeliness similarly demonstrated variations in performance of each phenotype, therefore resulting in fewer patients being identified with each incremental change.
    We used EHR data with diagnosis, medication, and laboratory data types from a large tertiary hospital system to understand T2D phenotypic differences and performance. We used induced data quality methods to learn how data quality issues may impact identification of the denominator populations upon which clinical (eg, clinical research and trials, population health evaluations) and financial or operational decisions are made. The novel results from our study may inform future approaches to shaping a common T2D computable phenotype definition that can be applied to clinical informatics, managing chronic conditions, and additional industry-wide efforts in health care.

  • Article type: Journal Article
    Data science emerged as a new cross-disciplinary field at the intersection of statistics, computer science, and expertise in a specific domain, such as health and biology. The data science field, alongside other data-related professions, is continuously evolving. We conducted a study examining tasks assigned to first-year internship students pursuing a Master's degree in Health Data Science, exploring the missions, the technologies employed, the skills required, and the alignment of the internship with students' training through semi-structured interviews with 32 participants. Three quarters of the students were placed in teams within the public sector. Among these entities, there were 11 hospitals and 12 universities. Although the majority of students did their internship as part of a methodological team, they often had a healthcare professional on their team. Nearly half of the missions involved descriptive analysis, followed by 9 missions focused on etiology or prediction and 8 missions on implementing a data warehouse. The majority of students had to perform data management and produce graphs, while only half conducted statistical analysis. The findings highlighted that data management remains a major challenge, and it should be taken into consideration when designing training programs. It remains to be determined whether this trend will continue with second-year students or whether, with experience, they are more often assigned statistical analyses.

  • Article type: Journal Article
    BACKGROUND: Leveraging electronic health record (EHR) data for clinical or research purposes heavily depends on data fitness. However, there is a lack of standardized frameworks to evaluate EHR data suitability, leading to inconsistent quality in data use projects (DUPs). This research focuses on the Medical Informatics for Research and Care in University Medicine (MIRACUM) Data Integration Centers (DICs) and examines empirical practices on assessing and automating the fitness-for-purpose of clinical data in German DIC settings.
    OBJECTIVE: The study aims (1) to capture and discuss how MIRACUM DICs evaluate and enhance the fitness-for-purpose of observational health care data and examine the alignment with existing recommendations and (2) to identify the requirements for designing and implementing a computer-assisted solution to evaluate EHR data fitness within MIRACUM DICs.
    METHODS: A qualitative approach was followed using an open-ended survey across DICs of 10 German university hospitals affiliated with MIRACUM. Data were analyzed using thematic analysis following an inductive qualitative method.
    RESULTS: All 10 MIRACUM DICs participated, with 17 participants revealing various approaches to assessing data fitness, including the 4-eyes principle and data consistency checks such as cross-system data value comparison. Common practices included a DUP-related feedback loop on data fitness and using self-designed dashboards for monitoring. Most experts had a computer science background and a master's degree, suggesting strong technological proficiency but potentially lacking clinical or statistical expertise. Nine key requirements for a computer-assisted solution were identified, including flexibility, understandability, extendibility, and practicability. Participants used heterogeneous data repositories for evaluating data quality criteria and practical strategies to communicate with research and clinical teams.
    CONCLUSIONS: The study identifies gaps between current practices in MIRACUM DICs and existing recommendations, offering insights into the complexities of assessing and reporting clinical data fitness. Additionally, a tripartite modular framework for fitness-for-purpose assessment was introduced to streamline the forthcoming implementation. It provides valuable input for developing and integrating an automated solution across multiple locations. This may include statistical comparisons to advanced machine learning algorithms for operationalizing frameworks such as the 3×3 data quality assessment framework. These findings provide foundational evidence for future design and implementation studies to enhance data quality assessments for specific DUPs in observational health care settings.

  • Article type: Journal Article
    Metadata describe and provide context for other data, playing a pivotal role in enabling findability, accessibility, interoperability, and reusability (FAIR) data principles. By providing comprehensive and machine-readable descriptions of digital resources, metadata empower both machines and human users to seamlessly discover, access, integrate, and reuse data or content across diverse platforms and applications. However, the limited accessibility and machine-interpretability of existing metadata for population health data hinder effective data discovery and reuse.
    To address these challenges, we propose a comprehensive framework using standardized formats, vocabularies, and protocols to render population health data machine-readable, significantly enhancing their FAIRness and enabling seamless discovery, access, and integration across diverse platforms and research applications.
    The framework implements a 3-stage approach. The first stage is Data Documentation Initiative (DDI) integration, which involves leveraging the DDI Codebook metadata and documentation of detailed information for data and associated assets, while ensuring transparency and comprehensiveness. The second stage is Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) standardization. In this stage, the data are harmonized and standardized into the OMOP CDM, facilitating unified analysis across heterogeneous data sets. The third stage involves the integration of Schema.org and JavaScript Object Notation for Linked Data (JSON-LD), in which machine-readable metadata are generated using Schema.org entities and embedded within the data using JSON-LD, boosting discoverability and comprehension for both machines and human users. We demonstrated the implementation of these 3 stages using the Integrated Disease Surveillance and Response (IDSR) data from Malawi and Kenya.
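    The third stage above, generating Schema.org metadata and embedding it as JSON-LD, can be sketched as follows. This is an illustrative sketch rather than the authors' published code: the function name, the dataset name, the description, and the example.org URL are placeholders, and only a handful of standard Schema.org `Dataset` properties are shown.

```python
import json

def dataset_jsonld(name, description, url, keywords, spatial):
    """Build a minimal Schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
        "spatialCoverage": {"@type": "Place", "name": spatial},
    }

# Placeholder values (assumptions), not the metadata published by the authors.
metadata = dataset_jsonld(
    name="Example IDSR surveillance extract",
    description="Illustrative description of a disease surveillance data set.",
    url="https://example.org/datasets/idsr-example",
    keywords=["population health", "OMOP CDM", "DDI Codebook"],
    spatial="Malawi",
)

# JSON-LD is typically embedded in an HTML page inside a script tag of this type,
# which is where dataset search crawlers look for it.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(metadata, indent=2)
    + "\n</script>"
)
```

    Placing the resulting `script_tag` in a data set's landing page is the usual way to expose such metadata to crawlers; richer descriptions would add further properties such as license, creator, or temporal coverage.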
    The implementation of our framework significantly enhanced the FAIRness of population health data, resulting in improved discoverability through seamless integration with platforms such as Google Dataset Search. The adoption of standardized formats and protocols streamlined data accessibility and integration across various research environments, fostering collaboration and knowledge sharing. Additionally, the use of machine-interpretable metadata empowered researchers to efficiently reuse data for targeted analyses and insights, thereby maximizing the overall value of population health resources. The JSON-LD codes are accessible via a GitHub repository and the HTML code integrated with JSON-LD is available on the Implementation Network for Sharing Population Information from Research Entities website.
    The adoption of machine-readable metadata standards is essential for ensuring the FAIRness of population health data. By embracing these standards, organizations can enhance diverse resource visibility, accessibility, and utility, leading to a broader impact, particularly in low- and middle-income countries. Machine-readable metadata can accelerate research, improve health care decision-making, and ultimately promote better health outcomes for populations worldwide.

  • Article type: Journal Article
    BACKGROUND: The impact of missing data on individual continuous glucose monitoring (CGM) data is unknown but can influence clinical decision-making for patients.
    OBJECTIVE: We aimed to investigate the consequences of data loss on glucose metrics in individual patient recordings from continuous glucose monitors and assess its implications on clinical decision-making.
    METHODS: The CGM data were collected from patients with type 1 and 2 diabetes using the FreeStyle Libre sensor (Abbott Diabetes Care). We selected 7-28 days of 24-hour continuous data without any missing values from each individual patient. To mimic real-world data loss, missing data ranging from 5% to 50% were introduced into the data set. Clinical metrics including time below range (TBR), TBR level 2 (TBR2), and other common glucose metrics were then calculated in the data sets with and without data loss. Recordings in which glucose metrics deviated relevantly due to data loss, as determined by clinical experts, were defined as expert panel boundary error (εEPB). These errors were expressed as a percentage of the total number of recordings. The errors for the recordings with glucose management indicator <53 mmol/mol were investigated.
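    The data-loss simulation described above can be sketched in a few lines. The sketch makes several assumptions that the abstract does not state: readings are removed completely at random, the recording has one reading every 15 minutes, and TBR and TBR2 use the commonly cited thresholds of 3.9 mmol/L and 3.0 mmol/L. The function names and the toy recording are hypothetical.

```python
import random

def time_below_range(glucose_mmol_l, threshold=3.9):
    """Percentage of available (non-missing) readings below `threshold`."""
    values = [g for g in glucose_mmol_l if g is not None]
    if not values:
        return float("nan")
    return 100.0 * sum(g < threshold for g in values) / len(values)

def induce_data_loss(glucose_mmol_l, loss_fraction, seed=0):
    """Replace a random share of readings with None (assumed: missing completely at random)."""
    rng = random.Random(seed)
    n_missing = round(len(glucose_mmol_l) * loss_fraction)
    missing = set(rng.sample(range(len(glucose_mmol_l)), n_missing))
    return [None if i in missing else g for i, g in enumerate(glucose_mmol_l)]

# Toy 14-day recording at an assumed 15-minute interval: 14 * 96 = 1344 readings.
recording = [6.0] * 1344
for i in range(200, 240):   # 40 readings of mild hypoglycemia
    recording[i] = 3.5
for i in range(700, 710):   # 10 readings below the level 2 threshold
    recording[i] = 2.8

tbr_full = time_below_range(recording, threshold=3.9)
tbr2_full = time_below_range(recording, threshold=3.0)

# Compare the metrics with and without 30% induced data loss.
degraded = induce_data_loss(recording, 0.30, seed=7)
tbr_lossy = time_below_range(degraded, threshold=3.9)
tbr2_lossy = time_below_range(degraded, threshold=3.0)
```

    Comparing `tbr_full` with `tbr_lossy` over many seeds and loss fractions gives a feel for how much a metric can drift. Short, rare episodes such as the 10 level 2 readings here are the ones most likely to shrink or vanish, which is consistent with the disappearing TBR2 episodes the abstract reports.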
    RESULTS: A total of 84 patients contributed to 798 recordings over 28 days. With 5%-50% data loss for 7-28 days recordings, the εEPB varied from 0 out of 798 (0.0%) to 147 out of 736 (20.0%) for TBR and 0 out of 612 (0.0%) to 22 out of 408 (5.4%) recordings for TBR2. In the case of 14-day recordings, TBR and TBR2 episodes completely disappeared due to 30% data loss in 2 out of 786 (0.3%) and 32 out of 522 (6.1%) of the cases, respectively. However, the initial values of the disappeared TBR and TBR2 were relatively small (<0.1%). In the recordings with glucose management indicator <53 mmol/mol the εEPB was 9.6% for 14 days with 30% data loss.
    CONCLUSIONS: With a maximum of 30% data loss in 14-day CGM recordings, there is minimal impact of missing data on the clinical interpretation of various glucose metrics.
    TRIAL REGISTRATION: ClinicalTrials.gov NCT05584293; https://clinicaltrials.gov/study/NCT05584293.

  • Article type: Journal Article
    Missed appointments can lead to treatment delays and adverse outcomes. Telemedicine may improve appointment completion because it addresses barriers to in-person visits, such as childcare and transportation. This study compared appointment completion for appointments using telemedicine versus in-person care in a large cohort of patients at an urban academic health sciences center.
    We conducted a retrospective cohort study of electronic health record data to determine whether telemedicine appointments have higher odds of completion compared to in-person care appointments between January 1, 2021, and April 30, 2023. The data were obtained from the University of South Florida (USF), a large academic health sciences center serving Tampa, FL, and surrounding communities. We implemented 1:1 propensity score matching based on age, gender, race, visit type, and Charlson Comorbidity Index (CCI).
    The matched cohort included 87 376 appointments, with diverse patient demographics. The percentage of completed telemedicine appointments exceeded that of completed in-person care appointments by 9.2 points (73.4% vs 64.2%, P < .001). The adjusted odds ratio for telemedicine versus in-person care in relation to appointment completion was 1.64 (95% CI, 1.59-1.69, P < .001), indicating that telemedicine appointments are associated with 64% higher odds of completion than in-person care appointments when controlling for other factors.
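    To make the relationship between the completion percentages and an odds ratio concrete, the short calculation below derives a crude (unadjusted) odds ratio from the two reported percentages. It is only a worked illustration: the 1.64 reported above is an adjusted estimate from a model with covariates, so it is not expected to equal the crude figure computed here.

```python
def odds(probability):
    """Convert a probability into odds, p / (1 - p)."""
    return probability / (1.0 - probability)

# Completion proportions reported in the abstract.
p_telemedicine = 0.734
p_in_person = 0.642

# Absolute difference in percentage points.
difference_points = (p_telemedicine - p_in_person) * 100

# Crude odds ratio computed from the two marginal proportions only.
crude_odds_ratio = odds(p_telemedicine) / odds(p_in_person)

print(f"difference: {difference_points:.1f} percentage points")
print(f"crude odds ratio: {crude_odds_ratio:.2f}")
```

    The crude value comes out near 1.54, somewhat below the adjusted 1.64. Note also that an odds ratio of 1.64 means 64% higher odds, not a 64% higher probability of completion; the difference in probability is the 9.2 percentage points.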
    This cohort study indicated that telemedicine appointments are more likely to be completed than in-person care appointments, regardless of demographics, comorbidity, payment type, or distance.
    Telemedicine appointments are more likely to be completed than in-person healthcare appointments.

  • Article type: Journal Article
    The operational environment of complex sociotechnical systems is inherently uncertain, demanding constant coordination restructuring to adapt to dynamic situational demands. However, coordination changes in the Human Factors and Ergonomics field have primarily been studied using static methods, overlooking moment-by-moment adjustments. In the current study, we address coordination restructuring by using THEME, a digital analytical tool capable of visualising and exploring coordination restructuring from a multi-layered perspective. We examine restructuring in coordination patterns during NASA's Apollo 13 Mission, revealing significant shifts from stable, long-duration 'coordination hubs' in routine operations to shorter-duration patterns during a crisis situation. Additionally, the results highlight the importance of flexible switching between reciprocal and one-directed coordination, along with enhanced role distribution. This study underscores how exploring temporality-sensitive phenomena like coordination through digital technologies such as THEME advances our understanding of incident analysis and resilient performance within complex systems.

  • Article type: Journal Article
    BACKGROUND: The medical knowledge graph provides explainable decision support, helping clinicians with prompt diagnosis and treatment suggestions. However, in real-world clinical practice, patients visit different hospitals seeking various medical services, resulting in fragmented patient data across hospitals. With data security issues, data fragmentation limits the application of knowledge graphs because single-hospital data cannot provide complete evidence for generating precise decision support and comprehensive explanations. It is important to study new methods for knowledge graph systems to integrate into multicenter, information-sensitive medical environments, using fragmented patient records for decision support while maintaining data privacy and security.
    OBJECTIVE: This study aims to propose an electronic health record (EHR)-oriented knowledge graph system for collaborative reasoning with multicenter fragmented patient medical data, all the while preserving data privacy.
    METHODS: The study introduced an EHR knowledge graph framework and a novel collaborative reasoning process for utilizing multicenter fragmented information. The system was deployed in each hospital and used a unified semantic structure and Observational Medical Outcomes Partnership (OMOP) vocabulary to standardize the local EHR data set. The system transformed local EHR data into semantic formats and performed semantic reasoning to generate intermediate findings. The generated intermediate findings used hypernym concepts to isolate original medical data. The intermediate findings and hash-encrypted patient identities were synchronized through a blockchain network. The multicenter intermediate findings were combined for final reasoning and clinical decision support without gathering original EHR data.
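    The idea of sharing only generalized findings keyed by hashed identities can be sketched as follows. This is a simplified illustration and not the system described above: it omits the knowledge graph, the OMOP vocabulary, and the blockchain synchronization. The salt, the field names, the eGFR cutoff of 60, and the 90-day persistence rule are assumptions chosen to resemble a common chronic kidney disease criterion.

```python
import hashlib
from collections import defaultdict
from datetime import date

SHARED_SALT = "demo-salt"  # hypothetical secret agreed between hospitals

def pseudonymize(patient_id, salt=SHARED_SALT):
    """Hash a patient identifier so that sites can link records without sharing the raw ID."""
    return hashlib.sha256(f"{salt}:{patient_id}".encode("utf-8")).hexdigest()

def intermediate_findings(local_rows, egfr_cutoff=60.0):
    """Run at each hospital: emit a generalized concept instead of the raw lab value."""
    return [
        (pseudonymize(row["patient_id"]), "reduced_kidney_function", row["date"])
        for row in local_rows
        if row["egfr"] < egfr_cutoff
    ]

def collaborative_flags(all_findings, min_days_apart=90):
    """Flag patients whose reduced-function findings span at least `min_days_apart` days."""
    dates_by_patient = defaultdict(list)
    for hashed_id, concept, when in all_findings:
        if concept == "reduced_kidney_function":
            dates_by_patient[hashed_id].append(when)
    return {
        hashed_id
        for hashed_id, dates in dates_by_patient.items()
        if (max(dates) - min(dates)).days >= min_days_apart
    }

# Toy local data sets (assumptions): neither hospital alone sees persistence for P001.
hospital_a = [
    {"patient_id": "P001", "egfr": 52.0, "date": date(2023, 1, 10)},
    {"patient_id": "P002", "egfr": 88.0, "date": date(2023, 1, 12)},
]
hospital_b = [
    {"patient_id": "P001", "egfr": 49.0, "date": date(2023, 5, 2)},
    {"patient_id": "P003", "egfr": 55.0, "date": date(2023, 5, 3)},
]

shared = intermediate_findings(hospital_a) + intermediate_findings(hospital_b)
flagged = collaborative_flags(shared)
```

    In this toy example only P001 is flagged, and only because findings from both hospitals are combined. A plain salted hash is used here for brevity; a production system would need a stronger linkage scheme, since short identifiers can be brute-forced if the salt leaks.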
    RESULTS: The system was evaluated through an application study that used multicenter fragmented EHR data to alert non-nephrology clinicians about overlooked patients with chronic kidney disease (CKD). The study covered 1185 patients in non-nephrology departments from 3 hospitals. The patients visited at least two of the hospitals. Of these, 124 patients were identified as meeting CKD diagnosis criteria through collaborative reasoning using multicenter EHR data, whereas the data from individual hospitals alone could not facilitate the identification of CKD in these patients. The assessment by clinicians indicated that 78/91 (86%) patients were CKD positive.
    CONCLUSIONS: The proposed system was able to effectively utilize multicenter fragmented EHR data for clinical application. The application study showed the clinical benefits of the system with prompt and comprehensive decision support.

  • Article type: Journal Article
    Guideline adherence in radiotherapy is crucial for maintaining treatment quality and consistency, particularly in non-trial patient settings where most treatments occur. The study aimed to assess the impact of guideline changes on treatment planning practices and compare manual registry data accuracy with treatment planning data.
    This study utilised the DBCG RT Nation cohort, a collection of breast cancer radiotherapy data in Denmark, to evaluate adherence to guidelines from 2008 to 2016. The cohort included 7448 high-risk breast cancer patients. National guideline changes included fractionation, introduction of respiratory gating, irradiation of the internal mammary lymph nodes, use of the simultaneous integrated boost technique, and inclusion of the left anterior descending coronary artery in delineation practice. Methods for structure name mapping, laterality detection, detection of temporal changes in population mean lung volume, and dose evaluation were presented and applied. Manually registered treatment characteristic data were obtained from the Danish Breast Cancer Database for comparison.
    The study found immediate and consistent adherence to guideline changes across Danish radiotherapy centres. Treatment practices before guideline implementation were documented and showed a variation among centres. Discrepancies between manual registry data and actual treatment planning data were as high as 10% for some measures.
    National guideline changes could be detected in the routine treatment data, with a high degree of compliance and short implementation time. Data extracted from treatment planning data files provides a more accurate and detailed characterisation of treatments and guideline adherence than medical register data.

  • Article type: Journal Article
    OBJECTIVE: First responders' mandatory reports of mental health episodes requiring emergency hospital care contain rich information about patients and their needs. In Queensland (Australia), much of the information contained in Emergency Examination Authorities (EEAs) remains unused. We propose and demonstrate a methodology to extract and translate vital information embedded in reports like EEAs and to use it to investigate the extreme propensity of incidence of serious mental health episodes.
    METHODS: The proposed method integrates clinical, demographic, spatial, and free-text information into a single data collection. The data are subjected to exploratory analysis for spatial pattern recognition, leading to an observational epidemiology model for the association of maximum spatial recurrence of EEA episodes.
    RESULTS: Sentiment analysis revealed that among EEA presentations, hospital and health service (HHS) region #4 had the lowest proportion of positive sentiments (18%) compared to 33% for HHS region #1, pointing to spatial differentiation of sentiments immanent in mandated free text, which required more detailed analysis. At the postcode geographical level, we found that variation in maximum spatial recurrence of EEAs was significantly positively associated with spatial range of sentiments (0.29, p < 0.001) and the postcode-referenced sex ratio (0.45, p = 0.01). The volatility of sentiments significantly correlated with extremes of recurrence of EEA episodes. The predicted (probabilistic) incidence rate, when mapped, reflected this correlation.
    CONCLUSIONS: The paper demonstrates the efficacy of integrating machine-extracted human sentiments (as potential surrogates) with conventional exposure variables for evidence-based methods for mental health spatial epidemiology. Such insights from informatics-driven epidemiological observations may inform the strategic allocation of health system resources to address the highest levels of need and to improve the standard of care for mental patients while also enhancing their safe and humane treatment and management.