Keywords: AI; ChatGPT; GPT-4; GPT-4V; LLM; NLP; answer; answers; artificial intelligence; chatbot; chatbots; conversational agent; conversational agents; exam; examination; examinations; exams; generative pretrained transformer; image; images; imaging; language model; language models; large language model; medical education; natural language processing; response; responses

MeSH: Japan; Language; Licensure; Medicine

Source: DOI:10.2196/54393; PDF (PubMed)

Abstract:
BACKGROUND: Previous research applying large language models (LLMs) to medicine was focused on text-based information. Recently, multimodal variants of LLMs acquired the capability of recognizing images.
OBJECTIVE: We aim to evaluate the image recognition capability of generative pretrained transformer (GPT)-4V, a recent multimodal LLM developed by OpenAI, in the medical field by testing how visual information affects its ability to answer questions from the 117th Japanese National Medical Licensing Examination.
METHODS: We focused on 108 questions that included 1 or more images as part of the question and presented GPT-4V with the same questions under two conditions: (1) with both the question text and associated images and (2) with the question text only. We then compared the difference in accuracy between the 2 conditions using the exact McNemar test (a minimal sketch of this test follows the abstract).
RESULTS: Among the 108 questions with images, GPT-4V's accuracy was 68% (73/108) when presented with images and 72% (78/108) when presented without images (P=.36). For the 2 question categories, clinical and general, the accuracies with and without images were 71% (70/98) versus 78% (76/98; P=.21) and 30% (3/10) versus 20% (2/10; P≥.99), respectively.
CONCLUSIONS: The additional information from the images did not significantly improve the performance of GPT-4V in the Japanese National Medical Licensing Examination.
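The Methods compare paired per-question outcomes (correct vs. incorrect under the two conditions) with the exact McNemar test. The snippet below is a minimal sketch of that test in Python using statsmodels; the 2x2 discordant split is a hypothetical placeholder chosen only so that the marginal totals match the reported accuracies (73/108 correct with images, 78/108 without), since the abstract does not report the paired breakdown.

```python
# Minimal sketch of the exact McNemar test on paired binary outcomes
# (each of the 108 questions answered correctly/incorrectly under two conditions).
# NOTE: the discordant counts below are hypothetical placeholders; only the
# marginal totals (73/108 correct with images, 78/108 without) come from the abstract.
from statsmodels.stats.contingency_tables import mcnemar

# 2x2 paired contingency table:
#   rows    = outcome with images    (correct, incorrect)
#   columns = outcome without images (correct, incorrect)
table = [
    [65, 8],   # correct with images:   65 also correct without images, 8 incorrect without
    [13, 22],  # incorrect with images: 13 correct without images, 22 incorrect without
]

# exact=True performs a binomial test on the discordant pairs (here 8 vs 13)
result = mcnemar(table, exact=True)
print(f"McNemar statistic = {result.statistic}, exact P = {result.pvalue:.2f}")
```

The exact variant tests only the discordant pairs with a binomial test, which is appropriate when those counts are small, as is typical for 108 paired items.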