images

  • Article type: Journal Article
    This article uses critical discourse analysis to investigate artificial intelligence (AI) generated images of aged care nurses and considers how perspectives and perceptions impact upon the recruitment and retention of nurses. The article demonstrates a recontextualization of aged care nursing, giving rise to hidden ideologies including harmful stereotypes which allow for discrimination and exploitation. It is argued that this may imply that nurses require fewer clinical skills in aged care, diminishing the value of working in this area. AI relies on existing data sets, and thus represents existing stereotypes and biases. The discourse analysis has highlighted key issues which may further impact upon nursing recruitment and retention, and advocates for stronger ethical consideration, including the use of experts in data validation, for the way that aged care services and nurses are depicted and thus valued.

  • Article type: Journal Article
    BACKGROUND: The use of ultrasound-based radiomic features to differentiate between benign and malignant breast lesions with the help of machine learning is currently being researched. The mean echogenicity ratio has been used for the diagnosis of malignant breast lesions. However, gray scale intensity histogram values as a single radiomic feature for the detection of malignant breast lesions using machine learning algorithms have not been explored yet.
    OBJECTIVE: This study aims to assess the utility of a simple convolutional neural network in classifying benign and malignant breast lesions using gray scale intensity values of the lesion.
    METHODS: An open-access online data set of 200 ultrasonogram breast lesions was collected, and regions of interest were drawn over the lesions. The gray scale intensity values of the lesions were extracted. An input file containing the values and an output file consisting of the breast lesions' diagnoses were created. The convolutional neural network was trained using the files and tested on the whole data set.
    RESULTS: The trained convolutional neural network had an accuracy of 94.5% and a precision of 94%. The sensitivity and specificity were 94.9% and 94.1%, respectively.
    CONCLUSIONS: Simple neural networks, which are cheap and easy to use, can be applied to diagnose malignant breast lesions with gray scale intensity values obtained from ultrasonogram images in low-resource settings with minimal personnel.
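
    The abstract gives no implementation details, so the following is only a rough sketch of the kind of model described: a small 1D convolutional network that classifies a lesion from its gray scale intensity histogram. The 256-bin histogram input, layer sizes, and PyTorch framework are assumptions for illustration, not details from the study.
    ```python
    # Illustrative sketch only: a tiny 1D CNN that classifies lesions as benign or
    # malignant from a 256-bin gray scale intensity histogram. The architecture and
    # hyperparameters are assumptions, not taken from the study.
    import torch
    import torch.nn as nn

    class HistogramCNN(nn.Module):
        def __init__(self, n_bins: int = 256):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(1, 8, kernel_size=5, padding=2),   # input: (batch, 1, n_bins)
                nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(8, 16, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.MaxPool1d(2),
            )
            self.classifier = nn.Linear(16 * (n_bins // 4), 2)  # benign vs. malignant

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.features(x)
            return self.classifier(x.flatten(1))

    # Example: one batch of 4 histograms, each normalized to sum to 1.
    histograms = torch.rand(4, 1, 256)
    histograms = histograms / histograms.sum(dim=-1, keepdim=True)
    logits = HistogramCNN()(histograms)
    print(logits.shape)  # torch.Size([4, 2])
    ```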

  • Article type: Journal Article
    BACKGROUND: Information about the range of Hounsfield values for healthy teeth tissues could become an additional tool in assessing dental health and could be used, among other data, for subsequent machine learning.
    OBJECTIVE: The purpose of our study was to determine dental tissue densities in Hounsfield units (HU).
    METHODS: The total sample included 36 healthy children (n=21, 58% girls and n=15, 42% boys) aged 10-11 years at the time of the study. The densities of 320 teeth tissues were analyzed. Data were expressed as means and SDs. The significance was determined using the Student (1-tailed) t test. The statistical significance was set at P<.05.
    RESULTS: The densities of 320 teeth tissues were analyzed: 72 (22.5%) first permanent molars, 72 (22.5%) permanent central incisors, 27 (8.4%) second primary molars, 40 (12.5%) tooth germs of second premolars, 37 (11.6%) second premolars, 9 (2.8%) second permanent molars, and 63 (19.7%) tooth germs of second permanent molars. The analysis of the data showed that tissues of healthy teeth in children have different density ranges: enamel, from mean 2954.69 (SD 223.77) HU to mean 2071.00 (SD 222.86) HU; dentin, from mean 1899.23 (SD 145.94) HU to mean 1323.10 (SD 201.67) HU; and pulp, from mean 420.29 (SD 196.47) HU to mean 183.63 (SD 97.59) HU. The tissues (enamel and dentin) of permanent central incisors in the mandible and maxilla had the highest mean densities. No gender differences concerning the density of dental tissues were reliably identified.
    CONCLUSIONS: The evaluation of Hounsfield values for dental tissues can be used as an objective method for assessing their densities. If the determined densities of the enamel, dentin, and pulp of the tooth do not correspond to the range of values for healthy tooth tissues, then it may indicate a pathology.
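
    As a worked illustration of the statistical comparison described in the methods, the sketch below runs a one-tailed Student t test with scipy on two samples of enamel densities. The sample values and group labels are invented for demonstration and are not the study's data.
    ```python
    # Illustrative sketch: a one-tailed Student t test comparing mean enamel density
    # (in HU) between two tooth groups. The values below are made up for demonstration.
    import numpy as np
    from scipy import stats

    incisor_enamel_hu = np.array([2950.0, 2890.5, 3010.2, 2875.8, 2998.1])
    molar_enamel_hu = np.array([2705.4, 2650.9, 2780.3, 2698.7, 2725.6])

    # scipy returns a two-sided p-value; halve it for a one-tailed test when the
    # observed difference is in the hypothesized direction (incisors denser).
    t_stat, p_two_sided = stats.ttest_ind(incisor_enamel_hu, molar_enamel_hu)
    p_one_tailed = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2

    print(f"t = {t_stat:.2f}, one-tailed P = {p_one_tailed:.4f}")
    print("significant at P < .05" if p_one_tailed < 0.05 else "not significant")
    ```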

  • Article type: Journal Article
    BACKGROUND: The integration of artificial intelligence (AI), particularly deep learning models, has transformed the landscape of medical technology, especially in the field of diagnosis using imaging and physiological data. In otolaryngology, AI has shown promise in image classification for middle ear diseases. However, existing models often lack patient-specific data and clinical context, limiting their universal applicability. The emergence of GPT-4 Vision (GPT-4V) has enabled a multimodal diagnostic approach, integrating language processing with image analysis.
    OBJECTIVE: In this study, we investigated the effectiveness of GPT-4V in diagnosing middle ear diseases by integrating patient-specific data with otoscopic images of the tympanic membrane.
    METHODS: The design of this study was divided into two phases: (1) establishing a model with appropriate prompts and (2) validating the ability of the optimal prompt model to classify images. In total, 305 otoscopic images of 4 middle ear diseases (acute otitis media, middle ear cholesteatoma, chronic otitis media, and otitis media with effusion) were obtained from patients who visited Shinshu University or Jichi Medical University between April 2010 and December 2023. The optimized GPT-4V settings were established using prompts and patients' data, and the model created with the optimal prompt was used to verify the diagnostic accuracy of GPT-4V on 190 images. To compare the diagnostic accuracy of GPT-4V with that of physicians, 30 clinicians completed a web-based questionnaire consisting of 190 images.
    RESULTS: The multimodal AI approach achieved an accuracy of 82.1%, which was superior to that of certified pediatricians at 70.6% but trailed behind that of otolaryngologists at more than 95%. The model's disease-specific accuracy rates were 89.2% for acute otitis media, 76.5% for chronic otitis media, 79.3% for middle ear cholesteatoma, and 85.7% for otitis media with effusion, which highlights the need for disease-specific optimization. Comparisons with physicians revealed promising results, suggesting the potential of GPT-4V to augment clinical decision-making.
    CONCLUSIONS: Despite its advantages, challenges such as data privacy and ethical considerations must be addressed. Overall, this study underscores the potential of multimodal AI for enhancing diagnostic accuracy and improving patient care in otolaryngology. Further research is warranted to optimize and validate this approach in diverse clinical settings.
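
    The study's actual prompts are not reproduced in the abstract; the sketch below only illustrates the general shape of a multimodal request that pairs an otoscopic image with brief patient context, assuming the OpenAI Python SDK. The model name, prompt wording, file name, and patient details are placeholders, not the optimized settings established in the study.
    ```python
    # Illustrative sketch only: sending an otoscopic image plus brief patient context
    # to a vision-capable GPT-4 model with the OpenAI Python SDK. Prompt, model name,
    # and patient fields are assumptions, not the study's setup.
    import base64
    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    with open("tympanic_membrane.jpg", "rb") as f:  # hypothetical file name
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",  # any vision-capable chat model
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": (
                            "Patient: 6-year-old, 3 days of ear pain and fever. "
                            "Classify the otoscopic image as acute otitis media, "
                            "middle ear cholesteatoma, chronic otitis media, or "
                            "otitis media with effusion, and explain briefly."
                        ),
                    },
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                    },
                ],
            }
        ],
    )
    print(response.choices[0].message.content)
    ```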

  • Article type: Journal Article
    No abstract available.

  • Article type: Journal Article
    No abstract available.

  • Article type: Journal Article
    BACKGROUND: Sudden unexpected infant death (SUID) remains a leading cause of infant mortality; therefore, understanding parental practices of infant sleep at home is essential. Since social media analyses yield invaluable patient perspectives, understanding sleep practices in the context of safe sleep recommendations via a Facebook mothers' group is instrumental for policy makers, health care providers, and researchers.
    OBJECTIVE: This study aimed to identify photos shared by mothers discussing SUID and safe sleep online and assess their consistency with infant sleep guidelines per the American Academy of Pediatrics (AAP). We hypothesized the photos would not be consistent with guidelines based on prior research and increasing rates of accidental suffocation and strangulation in bed.
    METHODS: Data were extracted from a Facebook mothers' group in May 2019. After trialing various search terms, searching for the term "SIDS" on the selected Facebook group resulted in the most relevant discussions on SUID and safe sleep. The resulting data, including 20 posts and 912 comments among 512 mothers, were extracted and underwent qualitative descriptive content analysis. In completing the extraction and subsequent analysis, 24 shared personal photos were identified among the discussions. Of the photos, 14 pertained to the infant sleep environment. Photos of the infant sleep environment were then assessed for consistency with safe sleep guidelines per the AAP standards by 2 separate reviewers.
    RESULTS: Of the shared photos relating to the infant sleep environment, 86% (12/14) were not consistent with AAP safe sleep guidelines. Specific inconsistencies included prone sleeping, foreign objects in the sleeping environment, and use of infant sleeping devices. Use of infant monitoring devices was also identified.
    CONCLUSIONS: This study is unique because the photos originated from the home setting, were in the context of SUID and safe sleep, and were obtained without researcher interference. Despite study limitations, the commonality of prone sleeping, foreign objects, and the use of both infant sleep and monitoring devices (ie, overall inconsistency regarding AAP safe sleep guidelines) sets the stage for future investigation regarding parental barriers to practicing safe infant sleep and has implications for policy makers, clinicians, and researchers.

  • Article type: Journal Article
    This document presents the protocol of a study conducted as a part of the WEB DATA OPP project, which is funded by the H2020 program. The study aimed to investigate different aspects of the collection of images through web surveys. To do this, we implemented a mobile web survey in an opt-in online panel in Spain. The survey had various questions, some of which were about the books that the participants had at their main residence. The questions related to books were asked in three different ways: regular survey questions showing visual examples of how different numbers of books fit on a 74-centimetre-wide shelf depending on their thickness, regular survey questions without the visual examples, and questions where participants were asked to send photos of the books at their home. This report explains how the study was designed and conducted. It covers important aspects such as the experimental design, the questionnaire used, the characteristics of the participants, ethical considerations, and plans for disseminating the results.
    This document presents the protocol of our study asking respondents for information about the books they have at home. This information was solicited through conventional types of questions (i.e., typing in answers or choosing one answer category), and/or through asking respondents to take and send photos of the books. This study has methodological and substantive objectives. The former involves investigating respondents' preferences, evaluation of the questions, participation levels, compliance, and data quality. The latter focuses on exploring the impact of the number of books on the academic achievement of children and examining other factors that might influence these relations. We conducted a mobile web survey, assigning respondents to four groups:
    • Choice: Respondents could choose their preferred answering method.
    • Text-TextPlus: Respondents answered conventional questions first, and later received illustrations of what different numbers of books look like to help them provide accurate answers.
    • TextPlus-Images: Respondents answered conventional questions with the illustrations and then submitted photos of the books at home.
    • Images-Text: Respondents shared photos of the books and then answered the conventional questions.
    Respondents were asked to evaluate their respective answering methods. The questionnaire had up to 65 questions covering various dimensions, including respondents' sociodemographic characteristics, children's academic performance, literacy-related activities, and camera usage. Data were collected using the Netquest opt-in online panel in Spain. The tool WebdataVisual was used to capture and share the photos. The target population included parents of children who lived with them and attended the first, third, or fifth year of primary school. The sample size was 1,202 cases. We expect this study to provide valuable insights regarding visual data collection through web surveys. Further, we expect to gain a better understanding of the data on the books respondents have at home when such data are collected through different methods.
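
    As a small illustration of the kind of experimental allocation the protocol describes, the sketch below randomly deals panel respondents into the four groups named above. The respondent IDs, equal allocation scheme, and fixed seed are assumptions for demonstration, not the protocol's actual procedure.
    ```python
    # Illustrative sketch: random assignment of survey panel respondents to the four
    # experimental groups described in the protocol. IDs and allocation are assumed.
    import random

    GROUPS = ["Choice", "Text-TextPlus", "TextPlus-Images", "Images-Text"]

    def assign_groups(respondent_ids, seed=42):
        """Shuffle respondents and deal them into the four groups round-robin."""
        rng = random.Random(seed)
        ids = list(respondent_ids)
        rng.shuffle(ids)
        return {rid: GROUPS[i % len(GROUPS)] for i, rid in enumerate(ids)}

    assignment = assign_groups(range(1, 1203))  # 1,202 cases, as in the protocol
    print(sum(1 for g in assignment.values() if g == "Choice"))  # roughly 300 per group
    ```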

  • Article type: Randomized Controlled Trial
    BACKGROUND: Accurately assessing an individual's diet is vital in the management of personal nutrition and in the study of the effect of diet on health. Despite its importance, the tools available for dietary assessment remain either too imprecise, expensive, or burdensome for clinical or research use. Image-based methods offer a potential new tool to improve the reliability and accessibility of dietary assessment. Though promising, image-based methods are sensitive to adherence, as images cannot be captured from meals that have already been consumed. Adherence to image-based methods may be improved with appropriately timed prompting via text message.
    OBJECTIVE: This study aimed to quantitatively examine the effect of prompt timing on adherence to an image-based dietary record and qualitatively explore the participant experience of dietary assessment in order to inform the design of a novel image-based dietary assessment tool.
    METHODS: This study used a randomized crossover design to examine the intraindividual effect of 3 prompt settings on the number of images captured in an image-based dietary record. The prompt settings were control, where no prompts were sent; standard, where prompts were sent at 7:15 AM, 11:15 AM, and 5:15 PM for every participant; and tailored, where prompt timing was tailored to habitual meal times for each participant. Participants completed a text-based dietary record at baseline to determine the timing of tailored prompts. Participants were randomized to 1 of 6 study sequences, each with a unique order of the 3 prompt settings, with each 3-day image-based dietary record separated by a washout period of at least 7 days. The qualitative component comprised semistructured interviews and questionnaires exploring the experience of dietary assessment.
    RESULTS: A total of 37 people were recruited, and 30 participants (11 male, 19 female; mean age 30, SD 10.8 years) completed all image-based dietary records. The image rate increased by 0.83 images per day in the standard setting compared to control (P=.23) and by 1.78 images per day in the tailored setting compared to control (P≤.001). We found that 13/21 (62%) of participants preferred the image-based dietary record over the text-based dietary record but reported method-specific challenges with each method, particularly the inability to record via an image after a meal had been consumed.
    CONCLUSIONS: Tailored prompting improves adherence to image-based dietary assessment. Future image-based dietary assessment tools should use tailored prompting and offer both image-based and written input options to improve record completeness.
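
    As an illustration of the crossover design described in the methods, the sketch below enumerates the 6 possible orderings of the 3 prompt settings, one per study sequence. The rotation used to assign participants to sequences is an assumption, not the trial's randomization procedure.
    ```python
    # Illustrative sketch: the six orderings of the three prompt settings
    # (control, standard, tailored), one per sequence in a 3-period crossover.
    from itertools import permutations

    SETTINGS = ("control", "standard", "tailored")
    SEQUENCES = list(permutations(SETTINGS))  # 6 unique orderings

    for i, seq in enumerate(SEQUENCES, start=1):
        print(f"Sequence {i}: " + " -> ".join(seq))

    def sequence_for(participant_index: int) -> tuple:
        """Assign participants to the 6 sequences in rotation (an assumption)."""
        return SEQUENCES[participant_index % len(SEQUENCES)]

    print(sequence_for(0))  # ('control', 'standard', 'tailored')
    ```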

  • Article type: Journal Article
    BACKGROUND: In the evolving field of health care, multimodal generative artificial intelligence (AI) systems, such as ChatGPT-4 with vision (ChatGPT-4V), represent a significant advancement, as they integrate visual data with text data. This integration has the potential to revolutionize clinical diagnostics by offering more comprehensive analysis capabilities. However, the impact on diagnostic accuracy of using image data to augment ChatGPT-4 remains unclear.
    OBJECTIVE: This study aims to assess the impact of adding image data on ChatGPT-4's diagnostic accuracy and provide insights into how image data integration can enhance the accuracy of multimodal AI in medical diagnostics. Specifically, this study endeavored to compare the diagnostic accuracy between ChatGPT-4V, which processed both text and image data, and its counterpart, ChatGPT-4, which uses only text data.
    METHODS: We identified a total of 557 case reports published in the American Journal of Case Reports from January 2022 to March 2023. After excluding cases that were nondiagnostic, pediatric, or lacking image data, we included 363 case descriptions with their final diagnoses and associated images. We compared the diagnostic accuracy of ChatGPT-4V and ChatGPT-4 without vision based on their ability to include the final diagnoses within differential diagnosis lists. Two independent physicians evaluated their accuracy, with a third resolving any discrepancies, ensuring a rigorous and objective analysis.
    RESULTS: The integration of image data into ChatGPT-4V did not significantly enhance diagnostic accuracy, showing that final diagnoses were included in the top 10 differential diagnosis lists at a rate of 85.1% (n=309), comparable to the rate of 87.9% (n=319) for the text-only version (P=.33). Notably, ChatGPT-4V's performance in correctly identifying the top diagnosis was inferior, at 44.4% (n=161), compared with 55.9% (n=203) for the text-only version (P=.002, χ2 test). Additionally, ChatGPT-4's self-reports showed that image data accounted for 30% of the weight in developing the differential diagnosis lists in more than half of cases.
    CONCLUSIONS: Our findings reveal that currently, ChatGPT-4V predominantly relies on textual data, limiting its ability to fully use the diagnostic potential of visual information. This study underscores the need for further development of multimodal generative AI systems to effectively integrate and use clinical image data. Enhancing the diagnostic performance of such AI systems through improved multimodal data integration could significantly benefit patient care by providing more accurate and comprehensive diagnostic insights. Future research should focus on overcoming these limitations, paving the way for the practical application of advanced AI in medicine.
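
    As a worked check of the reported comparison, the sketch below reruns a chi-square test on the top-diagnosis counts given in the abstract (161/363 correct with image data vs 203/363 text only) using scipy. Whether the authors applied a continuity correction is not stated, so the exact statistic may differ slightly from theirs.
    ```python
    # Illustrative sketch: a chi-square comparison of top-diagnosis accuracy using the
    # counts reported in the abstract (not the authors' original analysis code).
    from scipy.stats import chi2_contingency

    n = 363
    correct_vision, correct_text = 161, 203
    table = [
        [correct_vision, n - correct_vision],  # ChatGPT-4V (with image data)
        [correct_text, n - correct_text],      # ChatGPT-4 (text only)
    ]

    chi2, p_value, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, dof = {dof}, P = {p_value:.3f}")  # P should be ~.002
    ```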