In a randomized controlled experiment, n = 164 participants read six texts, three on a moral and three on a technological topic (predictor: text topic). The alleged author of each text was randomly labeled either "ChatGPT" or "human author" (predictor: authorship). We captured three dependent variables: assessment of author competence, assessment of content quality, and participants' intention to submit the text in a hypothetical university course (sharing intention). We hypothesized interaction effects; that is, we expected ChatGPT to score lower than alleged human authors for moral topics and higher than alleged human authors for technological topics, and vice versa.
We found only a small interaction effect for perceived author competence, p = 0.004, d = 0.40, but none for the other dependent variables. However, ChatGPT was consistently devalued compared to alleged human authors across all dependent variables: there were main effects of authorship on the assessment of author competence, p < 0.001, d = 0.95; on the assessment of content quality, p < 0.001, d = 0.39; and on sharing intention, p < 0.001, d = 0.57. There was also a small main effect of text topic on the assessment of content quality, p = 0.002, d = 0.35.
These results are more in line with previous findings on algorithm aversion than with algorithm appreciation. We discuss the implications of these findings for the acceptance of the use of LLMs for text composition.