Keywords: Artificial intelligence; Chatbots; Competence; Impression formation; Large language models; Morality; Social evaluation

MeSH: Humans; Artificial Intelligence; Morals; Female; Male; Adult; Judgment / physiology; Young Adult; Social Perception; Aptitude / physiology

Source: DOI: 10.1186/s41235-024-00573-7 (PubMed)

Abstract:
This paper examines how humans judge the capabilities of artificial intelligence (AI) to evaluate human attributes, specifically focusing on two key dimensions of human social evaluation: morality and competence. Furthermore, it investigates the impact of exposure to advanced Large Language Models on these perceptions. In three studies (combined N = 200), we tested the hypothesis that people will find it less plausible that AI is capable of judging the morality conveyed by a behavior compared to judging its competence. Participants estimated the plausibility of AI origin for a set of written impressions of positive and negative behaviors related to morality and competence. Studies 1 and 3 supported our hypothesis that people would be more inclined to attribute AI origin to competence-related impressions compared to morality-related ones. In Study 2, we found this effect only for impressions of positive behaviors. Additional exploratory analyses clarified that the differentiation between the AI origin of competence and morality judgments persisted throughout the first half year after the public launch of a popular AI chatbot (i.e., ChatGPT) and could not be explained by participants' general attitudes toward AI, or the actual source of the impressions (i.e., AI or human). These findings suggest an enduring belief that AI is less adept at assessing the morality compared to the competence of human behavior, even as AI capabilities continued to advance.