Emergency medicine

  • Article type: Journal Article
    OBJECTIVE: People experiencing homelessness and marginalization face considerable barriers to accessing healthcare services. Increased reliance on technology within healthcare has exacerbated these inequities. We evaluated a hospital-based prescription phone program that aimed to reduce digital health inequities and improve access to services among marginalized patients in Emergency Departments. We examined the perceived outcomes of the program and the contextual barriers and facilitators affecting those outcomes.
    METHODS: We conducted a constructivist qualitative program evaluation at two urban, academic hospitals in Toronto, Ontario. We interviewed 12 healthcare workers about their perspectives on program implementation and outcomes and analyzed the interview data using reflexive thematic analysis.
    RESULTS: Our analyses generated five interrelated program outcomes: building trust with patients, facilitating independence in healthcare, bridging sectors of care, enabling equitable care for marginalized populations, and mitigating moral distress among healthcare workers. Participants expressed that phone provision is critical for adequately serving patients who face barriers to accessing health and social services, and for supporting healthcare workers who often lack resources to adequately serve these patients. We identified key contextual enablers and challenges that may influence program outcomes and future implementation efforts.
    CONCLUSIONS: Our findings suggest that providing phones to marginalized patient populations may address digital and social health inequities; however, building trusting relationships with patients, understanding the unique needs of these populations, and operating within a biopsychosocial model of health are key to program success.

  • Article type: English Abstract
    BACKGROUND: The quality and promptness of prehospital care for major trauma patients are vital in order to lower their high mortality rate. However, the effectiveness of this response in Portugal is unknown. The objective of this study was to analyze response times and interventions for major trauma patients in the central region of Portugal.
    METHODS: This was a retrospective, descriptive study, using the 2022 clinical records of the National Institute of Medical Emergency's differentiated resources. Cases of death prior to arrival at the hospital and other non-transport situations were excluded. Five time intervals were determined, including the response time (T1, between activation and arrival at the scene), on-scene time (T2), and transportation time (T5, between the decision to transport and arrival at the emergency service). For each ambulance type, means and measures of dispersion were calculated, as well as the proportion of cases in which the nationally and internationally recommended times were met. The frequency of recording six key interventions was also assessed.
    RESULTS: Of the 3366 records, 602 were eliminated (384 due to death), resulting in 2764 cases: nurse-technician ambulance (SIV) = 36.0%, physician-nurse ambulance (VMER) = 62.2% and physician-nurse helicopter = 1.8%. In a very large number of records, it was not possible to determine prehospital care times: for example, transport time (T5) could be determined in only 29%, 13% and 8% of cases for SIV, VMER and helicopter, respectively. The recommended time for stabilization (T2 ≤ 20 min) was met in 19.8% (SIV), 36.5% (VMER) and 18.2% (helicopter) of the records. Time to hospital (T5 ≤ 45 min) was achieved in 80.0% (SIV), 93.1% (VMER) and 75.0% (helicopter) of the evaluable records. The administration of analgesia (42% in SIV) and measures to prevent hypothermia (23.5% in SIV) were the most recorded interventions.
    CONCLUSIONS: There was substantial missing data on statuses and a lack of information in the records, especially in the VMER and helicopter. According to the records, the time taken to stabilize the victim on-scene often exceeded the recommendations, while the time taken to transport them to the hospital tended to be within the recommendations.

  • Article type: Journal Article
    BACKGROUND: Ultrasound is used for peripheral intravenous (PIV) cannulation in patients with difficult landmark-guided IV access in the Emergency Department. Distal-to-proximal application of an Esmarch bandage on the target limb has been suggested as a method for increasing vein size and ease of cannulation.
    METHODS: This study was a single-blinded crossover randomized controlled trial comparing basilic vein size under ultrasound with use of an Esmarch bandage in addition to a standard IV tourniquet ("tourniquet + Esmarch") versus use of a standard IV tourniquet alone. Participant discomfort with the tourniquet + Esmarch was also compared to that with the standard IV tourniquet alone.
    RESULTS: Basilic vein size was measured in 22 healthy volunteers with and without the Esmarch bandage. There was no difference in basilic vein size between the two groups, with a mean diameter of 6.0 ± 1.5 mm in the tourniquet + Esmarch group and 6.0 ± 1.4 mm in the control group (p = 0.89). Discomfort score (from 0 to 10) differed between the groups, with a mean discomfort score of 2.1 in the tourniquet + Esmarch group and 1.1 in the standard IV tourniquet alone group (p < 0.001).
    CONCLUSIONS: This study showed that the use of an Esmarch bandage does not increase basilic vein size in healthy volunteers but is associated with a mild increase in discomfort.

  • Article type: Journal Article
    The Trauma Center (Hub) is a highly specialized hospital indicated for managing complex major trauma after stabilization at a first-level hospital (Spoke). Although this organization has demonstrated its effectiveness in reducing mortality in the United States, the data available in the Italian context are limited. On 30 September 2018, the University Hospital of Pisa formalized the introduction of the Trauma Center, optimizing Emergency Department (ED) organization to guarantee the highest standard of care. The aim of this study was to demonstrate that the new model led to better outcomes. We conducted a comparative retrospective study on 1154 major traumas over 24 months: the first 12 months (576 patients) correspond to the period before Trauma Center introduction, and the following 12 (457 patients) to the subsequent period. Results showed an increase in trauma with greater dynamics and in primary centralization by helicopter (p < 0.001 and p = 0.006, respectively). A systematic assessment with the ABCDE algorithm was performed in a higher proportion of patients in the most recent period, rising from 38.4% to 80.3% (p < 0.001). Use of Focused Assessment with Sonography for Trauma (FAST) performed by the emergency doctor increased after Trauma Center introduction (p < 0.001). The data show an increase in ATLS certification among staff from 51.9% to 71.4% and a reduction in early and late mortality after the Trauma Center introduction (p = 0.05 and p < 0.01, respectively). Fewer patients required intensive and surgical treatments, and hospital stays were shorter. The results demonstrate the advantage, in terms of outcomes, of the Trauma Center organization in the Italian context.

  • Article type: Journal Article
    Implementation and sustainability of new care processes in emergency departments (EDs) is difficult. We describe experiences of implementing geriatric care processes in EDs that upgraded their accreditation level for the Geriatric Emergency Department Accreditation (GEDA) program. These EDs can provide a model for adopting and sustaining guidelines for evidence-based geriatric care.
    We performed qualitative interviews with geriatric ED nurse and physician leaders overseeing their ED's geriatric accreditation processes. The interview guide was based on the Consolidated Framework for Implementation Research (CFIR), a framework consisting of a comprehensive set of factors that impact implementation of evidence-based interventions. We used inductive analysis to elucidate key themes from interviews and deductive analysis to map themes onto CFIR constructs.
    Clinician leaders from 15 of 19 EDs that upgraded accreditation status by March 1, 2023 participated in interviews. Motivations to upgrade accreditation level centered on improving patient care (73%) and achieving recognition (56%). Rationales for choosing specific care processes were more commonly related to feasibility (40%) and ability to integrate the processes into the electronic health record (33%) than to site-specific patient needs (20%). Several common experiences in implementation were identified: (1) financing from the larger health system or philanthropy was crucial; (2) translating the Geriatric ED Guidelines into clinical practice was challenging for clinician leaders; (3) motivational barriers existed among frontline ED staff; (4) longitudinal staff education was needed given frontline ED staff attrition and turnover; and (5) the electronic health record facilitated implementation of geriatric screenings.
    Geriatric ED accreditation involves significant time, resource allocation, and longitudinal staff commitment. EDs pursuing geriatric accreditation balance aspirations to improve patient care with resource availability to implement new care processes and competing priorities.

  • Article type: Journal Article
    OBJECTIVE: Falls are the leading cause of hospital transfer from residential aged care homes (RACHs). However, many falls do not result in significant injury, and ageing patients are exposed to complications while hospitalised. Inreach services are designed to reduce hospital transfer by providing care, support and assessment to residents at the RACH. This study evaluated a pilot inreach program targeting ageing patients following a fall.
    METHODS: We conducted a prospective, mixed methods evaluation of a 5-month (May-September 2022) pilot implementation across 108 government-funded RACHs within a single health-care network in Melbourne, Australia.
    RESULTS: A total of 123 residents (median [interquartile range] age: 88 [82, 94] years, female: 49%) were included in the intervention. The majority (n = 116, 94%) of residents were managed onsite and required no further investigation (n = 80, 69%) or treatment (n = 63, 54%). Among the seven residents referred to the emergency department (ED), two received hospital admission and five were transferred back to residential care. In the 7 days following referral to the intervention, four additional residents were referred to the ED and one received hospital admission. Qualitative feedback (n = 40) included specific comments relating to themes of general satisfaction (n = 20, 50%), compliments for staff (n = 16, 40%) and acknowledgement of comprehensiveness (n = 9, 23%).
    CONCLUSIONS: Implementation of a specialised fall assessment team to complement an existing geriatric-led RACH assessment service meant that a high rate of eligible residents were managed onsite, with very low need for subsequent hospitalisation. Residents, family members and caregivers expressed high rates of satisfaction with the service.

  • Article type: Journal Article
    Community-acquired pneumonia is a common cause of acute hospitalisation. Identifying patients with community-acquired pneumonia among patients suspected of having the disease can be a challenge, which leads to unnecessary antibiotic treatment. We investigated whether the circulatory pulmonary injury markers surfactant protein D (SP-D), Krebs von den Lungen-6 (KL-6), and Club cell protein 16 (CC16) could help identify patients with community-acquired pneumonia upon acute admission. In this multi-centre diagnostic accuracy study, SP-D, KL-6, and CC16 were quantified in plasma samples from acutely hospitalised patients with provisional diagnoses of community-acquired pneumonia. The area under the receiver operating characteristic curve (AUC) was calculated for each marker against the following outcomes: patients' final diagnoses regarding community-acquired pneumonia assigned by an expert panel, and pneumonic findings on chest CTs. Plasma samples from 339 patients were analysed. The prevalence of community-acquired pneumonia was 63%. AUCs for each marker against both final diagnoses and chest CT diagnoses ranged between 0.50 and 0.56. Thus, SP-D, KL-6, and CC16 demonstrated poor diagnostic performance for community-acquired pneumonia in acutely hospitalised patients. Our findings indicate that the markers cannot readily assist physicians in confirming or ruling out community-acquired pneumonia.

  • Article type: Observational Study
    No abstract available.

  • Article type: Journal Article
    Febrile young infants are at risk of serious bacterial infections (SBIs), which are potentially life-threatening. This study aims to investigate the association between delayed presentation and the risk of SBIs among febrile infants.
    We performed a prospective cohort study on febrile infants ≤90 days old presenting to a Singapore paediatric emergency department (ED) between November 2017 and July 2022. We defined delayed presentation as presentation to the ED >24 hours from fever onset. We compared the proportion of SBIs, and the clinical outcomes, in infants who had delayed presentation with those in infants who did not. We also performed a multivariable logistic regression to study whether delayed presentation was independently associated with the presence of SBIs.
    Among 1911 febrile infants analysed, 198 infants (10%) had delayed presentation. Febrile infants with delayed presentation were more likely to have SBIs (28.8% versus [vs] 16.3%, P<0.001). A higher proportion of infants with delayed presentation required intravenous antibiotics (64.1% vs 51.9%, P=0.001). After adjusting for age, sex and severity index score, delayed presentation was independently associated with the presence of SBI (adjusted odds ratio [AOR] 1.78, 95% confidence interval 1.26-2.52, P<0.001).
    Febrile infants with delayed presentation are at higher risk of SBI. Frontline clinicians should take this into account when assessing febrile infants.
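
    As a reading aid for the odds ratio reported above, the sketch below computes a crude (unadjusted) odds ratio from the two rounded SBI proportions in the abstract. It is not the study's analysis: the reported AOR of 1.78 comes from a multivariable logistic regression adjusted for age, sex and severity index score, which cannot be reproduced from the abstract alone. The function names are illustrative assumptions.

```python
def odds(probability: float) -> float:
    """Convert a probability into odds, p / (1 - p)."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return probability / (1.0 - probability)


def crude_odds_ratio(p_exposed: float, p_unexposed: float) -> float:
    """Unadjusted odds ratio comparing an exposed group with an unexposed group."""
    return odds(p_exposed) / odds(p_unexposed)


# Rounded proportions taken from the abstract (assumed exact here for illustration).
sbi_delayed = 0.288       # SBI proportion with delayed presentation
sbi_not_delayed = 0.163   # SBI proportion without delayed presentation

print(round(crude_odds_ratio(sbi_delayed, sbi_not_delayed), 2))
```

    The crude value comes out at roughly 2.1, somewhat above the adjusted estimate, which is the direction one would expect if the adjustment variables account for part of the association.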

  • Article type: Journal Article
    BACKGROUND: Large language models (LLMs) have demonstrated impressive performances in various medical domains, prompting an exploration of their potential utility within the high-demand setting of emergency department (ED) triage. This study evaluated the triage proficiency of different LLMs and ChatGPT, an LLM-based chatbot, compared to professionally trained ED staff and untrained personnel. We further explored whether LLM responses could guide untrained staff in effective triage.
    OBJECTIVE: This study aimed to assess the efficacy of LLMs and the associated product ChatGPT in ED triage compared to personnel of varying training status and to investigate whether the models' responses can enhance the triage proficiency of untrained personnel.
    METHODS: A total of 124 anonymized case vignettes were triaged by untrained doctors; different versions of currently available LLMs; ChatGPT; and professionally trained raters, who subsequently agreed on a consensus set according to the Manchester Triage System (MTS). The prototypical vignettes were adapted from cases at a tertiary ED in Germany. The main outcome was the level of agreement between raters' MTS level assignments, measured via quadratic-weighted Cohen κ. The extent of over- and undertriage was also determined. Notably, instances of ChatGPT were prompted using zero-shot approaches without extensive background information on the MTS. The tested LLMs included raw GPT-4, Llama 3 70B, Gemini 1.5, and Mixtral 8x7b.
    RESULTS: GPT-4-based ChatGPT and untrained doctors showed substantial agreement with the consensus triage of professional raters (κ=mean 0.67, SD 0.037 and κ=mean 0.68, SD 0.056, respectively), significantly exceeding the performance of GPT-3.5-based ChatGPT (κ=mean 0.54, SD 0.024; P<.001). When untrained doctors used this LLM for second-opinion triage, there was a slight but statistically insignificant performance increase (κ=mean 0.70, SD 0.047; P=.97). Other tested LLMs performed similarly to or worse than GPT-4-based ChatGPT or showed odd triaging behavior with the parameters used. LLMs and ChatGPT models tended toward overtriage, whereas untrained doctors undertriaged.
    CONCLUSIONS: While LLMs and the LLM-based product ChatGPT do not yet match professionally trained raters, their best models' triage proficiency equals that of untrained ED doctors. In their current form, LLMs and ChatGPT thus did not demonstrate gold-standard performance in ED triage and, in the setting of this study, failed to significantly improve untrained doctors' triage when used as decision support. Notable performance enhancements in newer LLM versions over older ones hint at future improvements with further technological development and specific training.
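
    The agreement measure named in the methods above, quadratic-weighted Cohen κ, can be sketched as follows. This is a minimal illustration of the standard formula, assuming integer triage levels 1 to 5; the function name and the example ratings are hypothetical and are not data or code from the study.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, n_levels):
    """Quadratic-weighted Cohen's kappa for two raters scoring the same cases.

    Ratings are integers from 1 to n_levels. Disagreements are penalised by
    the squared distance between the two assigned levels.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts per (level_a, level_b)
    marginal_a = Counter(rater_a)
    marginal_b = Counter(rater_b)
    max_distance = (n_levels - 1) ** 2

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(1, n_levels + 1):
        for j in range(1, n_levels + 1):
            weight = (i - j) ** 2 / max_distance
            observed_disagreement += weight * observed[(i, j)] / n
            # Expected joint proportion if the two raters were independent.
            expected_disagreement += weight * marginal_a[i] * marginal_b[j] / (n * n)

    if expected_disagreement == 0.0:
        raise ValueError("kappa is undefined when both raters use one identical level")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical ratings for eight vignettes on a five-level scale.
consensus = [1, 2, 2, 3, 3, 4, 5, 5]
candidate = [1, 2, 3, 3, 2, 4, 4, 5]
print(round(quadratic_weighted_kappa(consensus, candidate, n_levels=5), 3))
```

    Because the weights grow with the squared distance between levels, a rater who is off by one level is penalised far less than one who is off by three, which suits an ordinal scale such as the MTS.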