Mistral

  • Article type: Journal Article
    BACKGROUND: Large Language Models (LLMs) like ChatGPT have become increasingly prevalent. In medicine, many potential areas arise where LLMs may offer added value. Our research focuses on the use of open-source LLM alternatives like Llama 3, Gemma, Mistral, and Mixtral to extract medical parameters from German clinical texts. We concentrate on German due to an observed gap in research for non-English tasks.
    OBJECTIVE: To evaluate the effectiveness of open-source LLMs in extracting medical parameters from German clinical texts, focusing specifically on cardiovascular function indicators from cardiac MRI reports.
    METHODS: We extracted 14 cardiovascular function indicators, including left and right ventricular ejection fraction (LV-EF and RV-EF), from 497 variously formulated cardiac magnetic resonance imaging (MRI) reports. Our systematic analysis assessed the performance of the Llama 3, Gemma, Mistral, and Mixtral models in terms of correct annotation and named entity recognition (NER) accuracy.
    RESULTS: The analysis confirms strong performance across different architectures, with up to 95.4% correct annotation and 99.8% NER accuracy, even though these models were not explicitly fine-tuned for data extraction or for the German language.
    CONCLUSIONS: The results strongly support using open-source LLMs to extract medical parameters from clinical texts, including those in German, given their high accuracy and effectiveness even without specific fine-tuning.

  • Article type: Journal Article
    Objective: To evaluate and compare the quality and comprehensibility of answers produced by five distinct artificial intelligence (AI) chatbots (GPT-4, Claude, Mistral, Google PaLM, and Grok) in response to the most frequently searched questions about kidney stones (KS).
    Materials and Methods: Google Trends facilitated the identification of pertinent terms related to KS. Each AI chatbot was provided with a unique sequence of 25 commonly searched phrases as input. The responses were assessed using DISCERN, the Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P), the Flesch-Kincaid Grade Level (FKGL), and the Flesch-Kincaid Reading Ease (FKRE) criteria.
    Results: The three most frequently searched terms were "stone in kidney," "kidney stone pain," and "kidney pain." Nepal, India, and Trinidad and Tobago were the countries that performed the most searches for KS. None of the AI chatbots attained the requisite level of comprehensibility. Grok demonstrated the highest FKRE (55.6 ± 7.1) and lowest FKGL (10.0 ± 1.1) ratings (p = 0.001), whereas Claude outperformed the other chatbots in its DISCERN scores (47.6 ± 1.2) (p = 0.001). PEMAT-P understandability was the lowest in GPT-4 (53.2 ± 2.0), and actionability was the highest in Claude (61.8 ± 3.5) (p = 0.001).
    Conclusion: GPT-4 had the most complex language structure of the five chatbots, making it the most difficult to read and comprehend, whereas Grok was the simplest. Claude had the best KS text quality. Chatbot technology can improve healthcare material and make it easier to grasp.
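The abstract above reports FKRE and FKGL scores without stating how they are computed. As a rough illustration, the sketch below uses the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas; it is an assumption that the study applied these exact formulas. The function names and the sample counts are hypothetical, and the word, sentence, and syllable counts are taken as given inputs rather than derived from text.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease: higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level: approximate US school grade."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical counts for one chatbot answer (not data from the study).
    words, sentences, syllables = 100, 5, 150
    print(round(flesch_reading_ease(words, sentences, syllables), 2))
    print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
```

Both scores depend only on average sentence length and average syllables per word, which is why a chatbot with shorter sentences and simpler words scores higher on reading ease and lower on grade level at the same time.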

  • Article type: Journal Article
    Large language models (LLMs) are transformer-based neural networks that can provide human-like responses to questions and instructions. LLMs can generate educational material, summarize text, extract structured data from free text, create reports, write programs, and potentially assist in case sign-out. LLMs combined with vision models can assist in interpreting histopathology images. LLMs have immense potential in transforming pathology practice and education, but these models are not infallible, so any artificial intelligence generated content must be verified with reputable sources. Caution must be exercised on how these models are integrated into clinical practice, as these models can produce hallucinations and incorrect results, and an over-reliance on artificial intelligence may lead to de-skilling and automation bias. This review paper provides a brief history of LLMs and highlights several use cases for LLMs in the field of pathology.

  • Article type: Journal Article
    Health care-associated infection (HAI) surveillance is vital for safety in health care settings. It helps identify infection risk factors, enhancing patient safety and quality improvement. However, HAI surveillance is complex, demanding specialized knowledge and resources. This study investigates the use of artificial intelligence (AI), particularly generative large language models, to improve HAI surveillance.
    We assessed 2 AI agents, OpenAI's ChatGPT Plus (GPT-4) and a Mixtral 8×7b-based local model, for their ability to identify Central Line-Associated Bloodstream Infection (CLABSI) and Catheter-Associated Urinary Tract Infection (CAUTI) from 6 National Health Care Safety Network training scenarios. The complexity of these scenarios was analyzed, and responses were matched against expert opinions.
    Both AI models accurately identified CLABSI and CAUTI in all scenarios when given clear prompts. Challenges appeared with ambiguous prompts including Arabic numeral dates, abbreviations, and special characters, causing occasional inaccuracies in repeated tests.
    The study demonstrates AI's potential in accurately identifying HAIs like CLABSI and CAUTI. Clear, specific prompts are crucial for reliable AI responses, highlighting the need for human oversight in AI-assisted HAI surveillance.
    AI shows promise in enhancing HAI surveillance, potentially streamlining tasks, and freeing health care staff for patient-focused activities. Effective AI use requires user education and ongoing AI model refinement.

  • Article type: Journal Article
    Despite promising results in the early diagnosis of several pathologies, breath analysis remains unused in clinical practice because there are no standardized breath sampling procedures that guarantee good repeatability and comparability of results. The breath sampling method most widely used internationally relies on polymeric bags, but devices named Mistral and ReCIVA, which concentrate volatile organic compounds (VOCs) directly onto sorbent tubes, have recently been developed and launched on the market. A tailored experimental design was developed to explore the performance of these new automatic devices relative to sampling in polymeric bags, and to study the differences in VOC profile when whole or alveolar breath is collected and when pulmonary wash-out with clean air is performed. Three breath sampling approaches were compared: (a) whole breath sampling with Tedlar bags, (b) end-tidal breath collection using the Mistral sampler, and (c) simultaneous collection of whole and alveolar breath using the ReCIVA. The results showed that the alveolar fraction of breath was relatively less affected by ambient air (AA) contaminants (p-values of 0.04 for Mistral and 0.002 for ReCIVA Low) than whole breath (p-value of 0.97 for ReCIVA Whole). Compared with Tedlar bags, Mistral gave coherent results, while lower VOC levels were detected in samples (both breath and AA) collected by ReCIVA, likely due to uncorrected and fluctuating flow rates applied by this device. Finally, the analysis of all data, including data from the exploratory analysis of the single lung cancer (LC) breath sample, showed that a clean air supply might introduce a further confounding factor in breath analysis, considering that lung wash-out is species-dependent.

  • Article type: Journal Article
    Several documents and guidelines provide recommendations for effective management of COPD patients. However, there is often a significant imbalance between the recommended treatment of COPD patients and the actual care provided in both primary care and specialty settings. This imbalance could have a significant negative impact on patients' health status and quality of life, leading to increased hospitalisations and health resource utilisation in COPD patients.
    METHODS: MISTRAL was an observational, longitudinal, prospective cohort study designed to assess the overall pharmacological approach to COPD in routine clinical practice in Italy. Eligible patients were divided into two cohorts based on their exacerbation history in the year prior to enrolment: frequent exacerbators (FEs; ≥2 exacerbations) and non-frequent exacerbators (NFEs; ≤1 exacerbation). The primary objective was to assess adherence to Global Initiative for Chronic Obstructive Lung Disease (GOLD) 2011 treatment recommendations in FEs and NFEs at baseline and follow-up visits.
    RESULTS: Of the 1489 enrolled patients, 1468 (98.6%; FEs, 526; NFEs, 942) were considered evaluable for analyses. At baseline, 57.8% of patients were treated according to GOLD 2011 recommendations; a greater proportion of FEs than NFEs were treated according to GOLD recommendations at baseline (77.1% versus 46.7%; P < 0.0001) and at all study visits. At baseline, GOLD group D patients were the most adherent (81.2%) to treatment recommendations, while group A patients were the least adherent (30.3%), attributed mainly to overuse of inhaled corticosteroids in less severe GOLD groups. Triple therapy with long-acting muscarinic antagonist (LAMA) + long-acting β2-agonist/inhaled corticosteroid (LABA/ICS) was the most frequently prescribed treatment at all study visits, irrespective of the patient's exacerbation history. Changes in treatment were more frequent in FEs versus NFEs.
    CONCLUSIONS: The MISTRAL study reports poor adherence to the GOLD 2011 treatment recommendations in routine clinical practice in Italy. Adherence was particularly low in less severe, non-frequent exacerbating patients, mostly because of ICS overuse, and was higher in high-risk, frequent exacerbating COPD patients.