GPT

  • Article Type: Journal Article
    OBJECTIVE: This study evaluates the diagnostic accuracy of a multimodal large language model (LLM), ChatGPT-4, in recognizing glaucoma using color fundus photographs (CFPs) with a benchmark dataset and without prior training or fine-tuning.
    METHODS: The publicly accessible Retinal Fundus Glaucoma Challenge "REFUGE" dataset was utilized for the analyses. The input data consisted of the entire 400-image testing set. The task involved classifying fundus images as either 'Likely Glaucomatous' or 'Likely Non-Glaucomatous'. We constructed a confusion matrix to visualize the predictions from ChatGPT-4, focusing on the accuracy of the binary classification (glaucoma vs non-glaucoma).
    RESULTS: ChatGPT-4 demonstrated an accuracy of 90% with a 95% confidence interval (CI) of 87.06%-92.94%. The sensitivity was 50% (95% CI: 34.51%-65.49%), while the specificity was 94.44% (95% CI: 92.08%-96.81%). The precision was 50% (95% CI: 34.51%-65.49%), and the F1 score was 0.50.
    CONCLUSIONS: ChatGPT-4 achieved relatively high diagnostic accuracy without prior fine-tuning on CFPs. Considering the scarcity of data in specialized medical fields, including ophthalmology, advanced AI techniques such as LLMs might require less data for training than other forms of AI, with potential savings in time and financial resources. They may also pave the way for the development of innovative tools to support specialized medical care, particularly care dependent on multimodal data for diagnosis and follow-up, irrespective of resource constraints.
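    The reported metrics pin down the underlying confusion matrix: assuming the REFUGE test split of 40 glaucomatous and 360 non-glaucomatous images (consistent with the published challenge, but an assumption here), 50% sensitivity and 94.44% specificity imply TP = 20, FN = 20, TN = 340, FP = 20. A minimal sketch that reproduces the reported figures from those counts:

```python
# Minimal sketch: recover the reported metrics from the implied confusion matrix.
# The class split (40 glaucoma / 360 non-glaucoma) is assumed from the REFUGE test set.
TP, FN = 20, 20    # sensitivity 50% over 40 glaucomatous images
TN, FP = 340, 20   # specificity ~94.44% over 360 non-glaucomatous images

accuracy    = (TP + TN) / (TP + TN + FP + FN)   # 0.90
sensitivity = TP / (TP + FN)                    # 0.50 (recall)
specificity = TN / (TN + FP)                    # ~0.9444
precision   = TP / (TP + FP)                    # 0.50
f1 = 2 * precision * sensitivity / (precision + sensitivity)  # 0.50

print(accuracy, sensitivity, specificity, precision, f1)
```

    The identical sensitivity and precision values follow from the false negatives and false positives happening to be equal (20 each) under this assumed split.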

  • Article Type: Journal Article
    OBJECTIVE: Large Language Models (LLMs) like Generative Pre-trained Transformer (GPT) from OpenAI and LLaMA (Large Language Model Meta AI) from Meta AI are increasingly recognized for their potential in the field of cheminformatics, particularly in understanding Simplified Molecular Input Line Entry System (SMILES), a standard method for representing chemical structures. These LLMs also have the ability to decode SMILES strings into vector representations.
    METHODS: We investigate the performance of GPT and LLaMA, compared with models pre-trained on SMILES, in embedding SMILES strings for downstream tasks, focusing on two key applications: molecular property prediction and drug-drug interaction (DDI) prediction.
    RESULTS: We find that SMILES embeddings generated using LLaMA outperform those from GPT in both molecular property and DDI prediction tasks. Notably, LLaMA-based SMILES embeddings show results comparable to models pre-trained on SMILES in molecular property prediction tasks and outperform the pre-trained models in DDI prediction tasks.
    CONCLUSIONS: The performance of LLMs in generating SMILES embeddings shows great potential for further investigation of these models for molecular embedding. We hope our study bridges the gap between LLMs and molecular embedding, motivating additional research into the potential of LLMs in the molecular representation field. GitHub: https://github.com/sshaghayeghs/LLaMA-VS-GPT.
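    As an illustration of the embed-then-predict workflow the abstract describes (not the authors' exact pipeline), a SMILES string can be mapped to a fixed-length vector from a language model's hidden states and fed to a lightweight classifier; the checkpoint name, mean pooling, and toy labels below are assumptions made for the sketch:

```python
# Sketch: embed SMILES strings with a causal language model, then fit a simple
# property classifier on the frozen embeddings. The model name, mean pooling,
# and toy labels are assumptions, not the paper's exact setup.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

model_name = "meta-llama/Llama-2-7b-hf"   # hypothetical choice of LLaMA checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
lm = AutoModel.from_pretrained(model_name)

def embed(smiles: str) -> torch.Tensor:
    inputs = tokenizer(smiles, return_tensors="pt")
    with torch.no_grad():
        hidden = lm(**inputs).last_hidden_state   # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)          # mean-pool over tokens

smiles = ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"]   # ethanol, benzene, aspirin
labels = [0, 0, 1]                                       # toy binary property
X = torch.stack([embed(s) for s in smiles]).numpy()
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```

    In practice the frozen embeddings would be computed once for a full property-prediction or DDI dataset and the classifier trained on those rather than on toy labels.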

  • Article Type: Journal Article
    With the rapid progress in Natural Language Processing (NLP), Pre-trained Language Models (PLMs) such as BERT, BioBERT, and ChatGPT have shown great potential in various medical NLP tasks. This paper surveys the cutting-edge achievements in applying PLMs to various medical NLP tasks. Specifically, we first briefly introduce PLMs and outline the research on PLMs in medicine. Next, we categorise and discuss the types of tasks in medical NLP, covering text summarisation, question answering, machine translation, sentiment analysis, named entity recognition, information extraction, medical education, relation extraction, and text mining. For each type of task, we first provide an overview of the basic concepts, the main methodologies, the advantages of applying PLMs, the basic steps of applying PLMs, the datasets for training and testing, and the metrics for task evaluation. Subsequently, a summary of recent important research findings is presented, analysing their motivations, strengths and weaknesses, and similarities and differences, and discussing potential limitations. We also assess the quality and influence of the research reviewed in this paper by comparing the citation counts of the reviewed papers and the reputation and impact of the conferences and journals in which they were published. Through these indicators, we further identify the research topics currently receiving the most attention. Finally, we look forward to future research directions, including enhancing models' reliability, explainability, and fairness, to promote the application of PLMs in clinical practice. In addition, this survey collects download links for model code and the relevant datasets, which are valuable references for researchers applying NLP techniques in medicine and for medical professionals seeking to enhance their expertise and healthcare services through AI technology.

  • Article Type: Journal Article
    Psoriasis is an immune-mediated skin disease affecting approximately 3% of the global population. Proper management of this condition necessitates the assessment of the Body Surface Area (BSA) and the involvement of nails and joints. Recently, the integration of Natural Language Processing (NLP) with Electronic Medical Records (EMRs) has shown promise in advancing disease classification and research. This study evaluates the performance of ChatGPT-4, a commercial AI platform, in analyzing unstructured EMR data of psoriasis patients, particularly in identifying affected body areas.
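    A minimal sketch of the kind of prompt-based extraction such a study implies; the prompt wording, output schema, and model identifier are illustrative assumptions, not the authors' protocol:

```python
# Sketch: ask a GPT-4-class model to pull affected body areas out of an
# unstructured psoriasis note. Prompt and JSON schema are illustrative only.
from openai import OpenAI

client = OpenAI()   # expects OPENAI_API_KEY in the environment
note = "Plaque psoriasis on the scalp and both elbows; nail pitting present."

response = client.chat.completions.create(
    model="gpt-4o",   # stand-in for the ChatGPT-4 model used in the study
    messages=[
        {"role": "system",
         "content": "Extract from the note a JSON object with keys: "
                    "body_areas (list of strings), nail_involvement (bool), "
                    "joint_involvement (bool)."},
        {"role": "user", "content": note},
    ],
)
print(response.choices[0].message.content)
```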

  • Article Type: Journal Article
    The ability of Large Language Models (LLMs) to analyze and respond to freely written text is causing increasing excitement in the field of psychiatry; the application of such models presents unique opportunities and challenges for psychiatric applications. This review article seeks to offer a comprehensive overview of LLMs in psychiatry, their model architecture, potential use cases, and clinical considerations. LLM frameworks such as ChatGPT/GPT-4 are trained on huge amounts of text data and are sometimes fine-tuned for specific tasks. This opens up a wide range of possible psychiatric applications, such as accurately predicting individual patient risk factors for specific disorders, engaging in therapeutic intervention, and analyzing therapeutic material, to name a few. However, adoption in the psychiatric setting presents many challenges, including inherent limitations and biases in LLMs, concerns about explainability and privacy, and the potential harm resulting from generated misinformation. This review covers potential opportunities and limitations and highlights key considerations when these models are applied in a real-world psychiatric context.

  • Article Type: Journal Article
    Symbolic task planning is a widely used approach to enforce robot autonomy due to its ease of understanding and deployment in engineered robot architectures. However, techniques for symbolic task planning are difficult to scale in real-world, highly dynamic, human-robot collaboration scenarios because of their poor performance in planning domains where action effects may not be immediate, or when frequent re-planning is needed due to changed circumstances in the robot workspace. The long-term validity of plans, plan length, and planning time can hinder the robot's efficiency and negatively affect the overall fluency of human-robot interaction. We present a framework, which we refer to as Teriyaki, specifically aimed at bridging the gap between symbolic task planning and machine learning approaches. The rationale is to train a Large Language Model (LLM), namely GPT-3, into a neurosymbolic task planner compatible with the Planning Domain Definition Language (PDDL), and then to leverage its generative capabilities to overcome a number of limitations inherent to symbolic task planners. Potential benefits include (i) better scalability as planning-domain complexity increases, since LLMs' response time scales linearly with the combined length of the input and the output, rather than super-linearly as in the case of symbolic task planners, and (ii) the ability to synthesize a plan action by action instead of end-to-end, making each action available for execution as soon as it is generated rather than waiting for the whole plan to be available, which in turn enables concurrent planning and execution. In the past year, significant efforts have been devoted by the research community to evaluating the overall cognitive capabilities of LLMs, with mixed success. With Teriyaki, instead, we aim to provide overall planning performance comparable to traditional planners in specific planning domains, while leveraging LLM capabilities in other metrics, specifically those related to their short- and mid-term generative capabilities, which are used to build a look-ahead predictive planning model. Preliminary results in selected domains show that our method can: (i) solve 95.5% of problems in a test data set of 1,000 samples; (ii) produce plans up to 13.5% shorter than those of a traditional symbolic planner; and (iii) reduce average overall waiting times for plan availability by up to 61.4%.
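    A minimal sketch of the action-by-action generation loop described above, with a hypothetical prompt format, stop condition, and executor; it is not the Teriyaki implementation itself:

```python
# Sketch of interleaved plan generation and execution: the model emits one PDDL
# action per call, and each action is dispatched as soon as it is produced.
# The prompt format, the DONE sentinel, and execute() are hypothetical.
from openai import OpenAI

client = OpenAI()

def next_action(problem: str, history: list[str]) -> str:
    prompt = ("PDDL problem:\n" + problem +
              "\nActions so far:\n" + "\n".join(history) +
              "\nNext action (or DONE):")
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",   # stand-in for the fine-tuned GPT-3 planner
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()

def execute(action: str) -> None:
    print("executing", action)   # placeholder for the robot-side executor

def plan_and_execute(problem: str, max_steps: int = 20) -> None:
    history: list[str] = []
    for _ in range(max_steps):
        action = next_action(problem, history)
        if action == "DONE":
            break
        execute(action)          # execution starts before the full plan exists
        history.append(action)
```

    Because execute() runs as soon as each action is returned, planning and execution overlap, which is the source of the reduced waiting times reported above.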

  • Article Type: Journal Article
    No abstract available.

  • Article Type: Journal Article
    BACKGROUND: The integration of artificial intelligence (AI), particularly deep learning models, has transformed the landscape of medical technology, especially in the field of diagnosis using imaging and physiological data. In otolaryngology, AI has shown promise in image classification for middle ear diseases. However, existing models often lack patient-specific data and clinical context, limiting their universal applicability. The emergence of GPT-4 Vision (GPT-4V) has enabled a multimodal diagnostic approach, integrating language processing with image analysis.
    OBJECTIVE: In this study, we investigated the effectiveness of GPT-4V in diagnosing middle ear diseases by integrating patient-specific data with otoscopic images of the tympanic membrane.
    METHODS: The study was divided into two phases: (1) establishing a model with appropriate prompts and (2) validating the ability of the optimal prompt model to classify images. In total, 305 otoscopic images of 4 middle ear diseases (acute otitis media, middle ear cholesteatoma, chronic otitis media, and otitis media with effusion) were obtained from patients who visited Shinshu University or Jichi Medical University between April 2010 and December 2023. The optimized GPT-4V settings were established using prompts and patients' data, and the model created with the optimal prompt was used to verify the diagnostic accuracy of GPT-4V on 190 images. To compare the diagnostic accuracy of GPT-4V with that of physicians, 30 clinicians completed a web-based questionnaire consisting of 190 images.
    RESULTS: The multimodal AI approach achieved an accuracy of 82.1%, superior to that of certified pediatricians at 70.6% but trailing behind that of otolaryngologists at more than 95%. The model's disease-specific accuracy rates were 89.2% for acute otitis media, 76.5% for chronic otitis media, 79.3% for middle ear cholesteatoma, and 85.7% for otitis media with effusion, which highlights the need for disease-specific optimization. Comparisons with physicians revealed promising results, suggesting the potential of GPT-4V to augment clinical decision-making.
    CONCLUSIONS: Despite its advantages, challenges such as data privacy and ethical considerations must be addressed. Overall, this study underscores the potential of multimodal AI for enhancing diagnostic accuracy and improving patient care in otolaryngology. Further research is warranted to optimize and validate this approach in diverse clinical settings.
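    For reference, multimodal prompting of the kind evaluated here can be done by sending the otoscopic image alongside the patient data in a single request; the prompt text and model identifier below are illustrative assumptions rather than the study's optimized prompt:

```python
# Sketch: send an otoscopic image plus brief patient data to a vision-capable
# GPT-4 model and ask for one of the four diagnoses. Prompt text is illustrative.
import base64
from openai import OpenAI

client = OpenAI()
with open("tympanic_membrane.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",   # stand-in for GPT-4 Vision
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Patient: 6-year-old, 2 days of ear pain and fever. "
                     "Classify the otoscopic image as acute otitis media, "
                     "chronic otitis media, middle ear cholesteatoma, or "
                     "otitis media with effusion."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```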

  • Article Type: Journal Article
    BACKGROUND: Large Language Models (LLMs) play a crucial role in clinical information processing, showcasing robust generalization across diverse language tasks. However, existing LLMs, despite their significance, lack optimization for clinical applications, presenting challenges in terms of hallucinations and interpretability. The Retrieval-Augmented Generation (RAG) model addresses these issues by providing sources for answer generation, thereby reducing errors. This study explores the application of RAG technology in clinical gastroenterology to enhance knowledge generation on gastrointestinal diseases.
    METHODS: We fine-tuned the embedding model using a corpus consisting of 25 guidelines on gastrointestinal diseases. The fine-tuned model exhibited an 18% improvement in hit rate compared with its base model, gte-base-zh. Moreover, it outperformed OpenAI's embedding model by 20%. Employing the RAG framework with llama-index, we developed a Chinese gastroenterology chatbot named "GastroBot," which significantly improves answer accuracy and contextual relevance while minimizing errors and the risk of disseminating misleading information.
    RESULTS: When evaluating GastroBot with the RAGAS framework, we observed a context recall rate of 95%, faithfulness to the source of 93.73%, and answer relevance of 92.28%. These findings highlight the effectiveness of GastroBot in providing accurate and contextually relevant information about gastrointestinal diseases. In a manual assessment, compared with other models, GastroBot delivered a substantial amount of valuable knowledge while ensuring the completeness and consistency of its results.
    CONCLUSIONS: The findings suggest that incorporating the RAG method into clinical gastroenterology can enhance the accuracy and reliability of large language models. As a practical implementation of this method, GastroBot demonstrates significant improvements in contextual comprehension and response quality. Continued exploration and refinement of the model are poised to drive forward clinical information processing and decision support in the gastroenterology field.
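    A minimal sketch of a llama-index RAG pipeline of the kind described (import paths assume llama-index 0.10+; the document directory, retrieval depth, and generator model are assumptions, and the fine-tuned embedding checkpoint used by GastroBot is not specified in the abstract):

```python
# Sketch: index gastroenterology guideline documents and answer a question with
# retrieval-augmented generation via llama-index (0.10+ import paths assumed).
# An LLM API key is needed for the answer-generation step (OpenAI by default).
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# gte-base-zh is the base embedding model named in the abstract; GastroBot's
# fine-tuned checkpoint is not public, so the base model stands in here.
Settings.embed_model = HuggingFaceEmbedding(model_name="thenlper/gte-base-zh")

documents = SimpleDirectoryReader("guidelines/").load_data()   # e.g. the 25 guideline files
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(similarity_top_k=3)   # retrieve top 3 chunks
answer = query_engine.query("What is the first-line eradication regimen for H. pylori?")
print(answer)
```

    Grounding each answer in retrieved guideline chunks is what the RAGAS context-recall and faithfulness metrics quoted above are measuring.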

  • Article Type: Journal Article
    The increasing development of artificial intelligence (AI) generative models in otolaryngology-head and neck surgery will progressively change our practice. Practitioners and patients have access to AI resources, which can improve information, knowledge, and the practice of patient care. This article summarizes the currently investigated applications of AI generative models, particularly Chatbot Generative Pre-trained Transformer, in otolaryngology-head and neck surgery.
