Generative AI

  • Article type: Journal Article
    Using simulated patients to mimic 9 established noncommunicable and infectious diseases, we assessed ChatGPT's performance in treatment recommendations for common diseases in low- and middle-income countries. ChatGPT had a high level of accuracy in both correct diagnoses (20/27, 74%) and medication prescriptions (22/27, 82%) but a concerning level of unnecessary or harmful medications (23/27, 85%) even with correct diagnoses. ChatGPT performed better in managing noncommunicable diseases than infectious ones. These results highlight the need for cautious AI integration in health care systems to ensure quality and safety.
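At n = 27 simulated encounters, the reported proportions carry wide uncertainty. A minimal sketch of quantifying this with a Wilson score interval (the interval choice is ours for illustration, not the paper's method):

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (95% by default)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

# Counts as reported in the abstract
diagnosis = wilson_interval(20, 27)     # 74% correct diagnoses
prescription = wilson_interval(22, 27)  # 82% correct prescriptions
```

For the diagnosis figure this gives roughly 55% to 87%, which is why the abstract's call for caution is well founded despite the headline accuracy.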

  • Article type: Letter
    No abstract available.

  • Article type: Journal Article
    Generative AI tools exemplified by ChatGPT are becoming a new reality. This study is motivated by the premise that "AI-generated content may exhibit a distinctive behavior that can be separated from scientific articles". In this study, we show how articles can be generated by means of prompt engineering for various diseases and conditions. We then show how we tested this premise in two phases and prove its validity. Subsequently, we introduce xFakeSci, a novel learning algorithm capable of distinguishing ChatGPT-generated articles from publications produced by scientists. The algorithm is trained using network models derived from both sources. To mitigate overfitting, we incorporated a calibration step built upon data-driven heuristics, including proximity and ratios. Specifically, from a total of 3952 fake articles covering three different medical conditions, the algorithm was trained using only 100 articles but calibrated using folds of 100 articles. The classification step was performed using 300 articles per condition, and the actual labeling step was carried out on an equal mix of 50 generated articles and 50 authentic PubMed abstracts. The testing spanned publication periods from 2010 to 2024 and encompassed research on three distinct diseases: cancer, depression, and Alzheimer's. Further, we evaluated the accuracy of the xFakeSci algorithm against classical data mining algorithms (e.g., Support Vector Machines, Regression, and Naive Bayes). The xFakeSci algorithm achieved F1 scores ranging from 80% to 94%, outperforming the common data mining algorithms, which scored F1 values between 38% and 52%. We attribute this noticeable difference to the introduction of the calibration and proximity-distance heuristics, which underpin the promising performance. Indeed, predicting fake science generated by ChatGPT presents a considerable challenge; nonetheless, the xFakeSci algorithm is a significant step toward combating it.
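The abstract compares classifiers by F1 score. A minimal pure-Python sketch of that metric, with hypothetical confusion counts chosen only to mirror the reported gap (xFakeSci at 80-94% versus classical miners at 38-52%), not taken from the paper:

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative confusion counts for a 50 fake / 50 real test mix
xfakesci_f1 = f1_score(tp=45, fp=5, fn=5)    # high precision and recall
baseline_f1 = f1_score(tp=20, fp=35, fn=30)  # weak on both
```

Because F1 penalizes imbalance between precision and recall, a classifier that flags many genuine abstracts as fake (high fp) is punished even if it catches most fakes.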

  • Article type: Journal Article
    A smarter city should be a safer city. Nighttime safety in metropolitan areas has long been a global concern, particularly for large cities with diverse demographics and intricate urban forms, whose citizens are often threatened by higher street-level crime rates. However, due to the lack of night-time urban appearance data, prior studies based on street view imagery (SVI) rarely addressed the perceived night-time safety issue, which has important implications for crime prevention. This study hypothesizes that night-time SVI can be effectively generated from widely available daytime SVI using generative AI (GenAI). To test the hypothesis, this study first collects pairwise day-and-night SVIs across four cities with divergent urban landscapes to construct a comprehensive day-and-night SVI dataset. It then trains and validates a day-to-night (D2N) model with fine-tuned brightness adjustment, effectively transforming daytime SVIs into nighttime ones for distinct urban forms, tailored for urban scene perception studies. Our findings indicate that: (1) the performance of D2N transformation varies significantly with urban-scape variations related to urban density; (2) the proportions of building and sky views are important determinants of transformation accuracy; (3) among prevalent models, CycleGAN maintains the consistency of D2N scene conversion but requires abundant data. Pix2Pix achieves considerable accuracy when pairwise day-and-night SVIs are available, but is sensitive to data quality. StableDiffusion yields high-quality images at a high training cost. Therefore, CycleGAN is most effective in balancing accuracy, data requirements, and cost. This study contributes to urban scene studies by constructing a first-of-its-kind D2N dataset consisting of pairwise day-and-night SVIs across various urban forms. The D2N generator will provide a cornerstone for future urban studies that heavily utilize SVIs to audit urban environments.
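The D2N model couples learned image translation with a fine-tuned brightness adjustment. As a stand-in illustration of the brightness component only (not the authors' model), a fixed gamma-and-gain curve darkening 8-bit pixel values:

```python
def darken(pixels, gamma=2.2, gain=0.35):
    """Toy day-to-night brightness curve: normalize each 8-bit value to
    [0, 1], apply a gamma curve plus a gain factor, and rescale. A learned
    model would adapt these parameters per scene; here they are fixed."""
    out = []
    for v in pixels:
        x = v / 255.0
        y = gain * (x ** gamma)
        out.append(round(min(max(y, 0.0), 1.0) * 255))
    return out

day = [255, 180, 90, 30]
night = darken(day)
```

The gamma term compresses mid-tones more than highlights, which is why a uniform brightness scale alone cannot mimic night scenes: dark regions fall off much faster than lit ones.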

  • Article type: Journal Article
    The increasing interest in the potential applications of generative artificial intelligence (AI) models like ChatGPT in health care has prompted numerous studies to explore its performance in various medical contexts. However, evaluating ChatGPT poses unique challenges due to the inherent randomness in its responses. Unlike traditional AI models, ChatGPT generates different responses for the same input, making it imperative to assess its stability through repetition. This commentary highlights the importance of including repetition in the evaluation of ChatGPT to ensure the reliability of conclusions drawn from its performance. Similar to biological experiments, which often require multiple repetitions for validity, we argue that assessing generative AI models like ChatGPT demands a similar approach. Failure to acknowledge the impact of repetition can lead to biased conclusions and undermine the credibility of research findings. We urge researchers to incorporate appropriate repetition in their studies from the outset and transparently report their methods to enhance the robustness and reproducibility of findings in this rapidly evolving field.
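The commentary's core recommendation, repeat the same query and measure response stability, can be sketched as follows (the agreement metric and the toy stochastic model are our own illustration, not the authors' protocol):

```python
import random

def agreement_rate(sample_response, n_runs=30, seed=0):
    """Query a stochastic model n_runs times and report how often the
    modal answer recurs. `sample_response` stands in for one model call;
    a real evaluation would hit the API here instead."""
    rng = random.Random(seed)  # fixed seed for reproducible simulation
    answers = [sample_response(rng) for _ in range(n_runs)]
    modal = max(set(answers), key=answers.count)
    return answers.count(modal) / n_runs

# Toy model that answers "A" about 80% of the time and "B" otherwise
rate = agreement_rate(lambda rng: "A" if rng.random() < 0.8 else "B")
```

Reporting a single run would hide exactly the variability this agreement rate exposes, which is the commentary's point: conclusions should rest on the distribution of responses, not on one draw.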

  • Article type: Journal Article
    Large language models (LLMs) have the potential to transform our lives and work through the content they generate, known as AI-Generated Content (AIGC). To harness this transformation, we need to understand the limitations of LLMs. Here, we investigate the bias of AIGC produced by seven representative LLMs, including ChatGPT and LLaMA. We collect news articles from The New York Times and Reuters, both known for their dedication to providing unbiased news. We then apply each examined LLM to generate news content with the headlines of these news articles as prompts, and evaluate the gender and racial biases of the AIGC produced by the LLM by comparing the AIGC with the original news articles. We further analyze the gender bias of each LLM under biased prompts by adding gender-biased messages to prompts constructed from these news headlines. Our study reveals that the AIGC produced by each examined LLM demonstrates substantial gender and racial biases. Moreover, the AIGC generated by each LLM exhibits notable discrimination against women and Black individuals. Among the LLMs, the AIGC generated by ChatGPT demonstrates the lowest level of bias, and ChatGPT is the sole model capable of declining content generation when provided with biased prompts.
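A toy version of the kind of generated-versus-original comparison the study performs, here reduced to a crude female-pronoun share (the real audit uses far richer bias measures, and the sample sentences below are invented):

```python
import re

FEMALE = {"she", "her", "hers"}
MALE = {"he", "him", "his"}

def female_pronoun_share(text):
    """Share of gendered pronouns that are female; None if no pronouns.
    A deliberately crude probe: real bias audits also examine names,
    roles, sentiment, and racial markers."""
    tokens = re.findall(r"[a-z']+", text.lower())
    f = sum(t in FEMALE for t in tokens)
    m = sum(t in MALE for t in tokens)
    return f / (f + m) if (f + m) else None

original = "She said her team won. He agreed with her."
generated = "He said his team won. He praised him."
```

Comparing the two shares shows the direction of drift: if generated text consistently scores lower than its source article, the model is erasing or masculinizing female references.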

  • Article type: Journal Article
    We deploy a prompt-augmented GPT-4 model to distill comprehensive datasets on the global application of debt-for-nature swaps (DNS), a pivotal financial tool for environmental conservation. Our analysis covers 195 nations and identifies 21 countries that have not previously used DNS as prime candidates for it. A significant proportion of these candidates demonstrates a consistent commitment to conservation finance (0.86 accuracy compared against historical swap records). Conversely, 35 countries active in DNS before 2010 have since been identified as unsuitable. Notably, Argentina, grappling with soaring inflation and a substantial sovereign debt crisis, and Poland, which has achieved economic stability and gained access to alternative EU conservation funds, exemplify the shifting suitability landscape. The study's outcomes illuminate the fragility of DNS as a conservation strategy amid economic and political volatility.

  • Article type: Journal Article
    No abstract available.
