Keywords: ChatGPT; Theoretical Domains Framework; natural language processing; qualitative content analysis

MeSH: Qualitative Research; Humans

Source: DOI:10.2196/59050

Abstract:
BACKGROUND: Data analysis approaches such as qualitative content analysis are notoriously time and labor intensive because of the time needed to detect, assess, and code a large amount of data. Tools such as ChatGPT may have tremendous potential in automating at least some of the analysis.
OBJECTIVE: The aim of this study was to explore the utility of ChatGPT in conducting qualitative content analysis through the analysis of forum posts from people sharing their experiences on reducing their sugar consumption.
METHODS: Inductive and deductive content analyses were performed on 537 forum posts to detect mechanisms of behavior change. Thorough prompt engineering provided appropriate instructions for ChatGPT to execute data analysis tasks. Data identification involved extracting change mechanisms from a subset of forum posts. The precision of the extracted data was assessed through comparison with human coding. On the basis of the identified change mechanisms, coding schemes were developed with ChatGPT using data-driven (inductive) and theory-driven (deductive) content analysis approaches. The deductive approach was informed by the Theoretical Domains Framework using both an unconstrained coding scheme and a structured coding matrix. In total, 10 coding schemes were created from a subset of data and then applied to the full data set in 10 new conversations, resulting in 100 conversations each for inductive and unconstrained deductive analysis. A total of 10 further conversations coded the full data set into the structured coding matrix. Intercoder agreement was evaluated across and within coding schemes. ChatGPT output was also evaluated by the researchers to assess whether it reflected prompt instructions.
RESULTS: The precision of detecting change mechanisms in the data subset ranged from 66% to 88%. Overall κ scores for intercoder agreement ranged from 0.72 to 0.82 across inductive coding schemes and from 0.58 to 0.73 across unconstrained coding schemes and the structured coding matrix. Coding into the best-performing coding scheme resulted in category-specific κ scores ranging from 0.67 to 0.95 for the inductive approach and from 0.13 to 0.87 for the deductive approaches. ChatGPT largely followed prompt instructions in producing a description of each coding scheme, although the wording for the inductively developed coding schemes was lengthier than specified.
CONCLUSIONS: ChatGPT appears fairly reliable in assisting with qualitative analysis. ChatGPT performed better in developing an inductive coding scheme that emerged from the data than in adapting an existing framework into an unconstrained coding scheme or coding directly into a structured matrix. The potential for ChatGPT to act as a second coder also appears promising, with almost perfect agreement in at least 1 coding scheme. The findings suggest that ChatGPT could prove useful as a tool to assist in each phase of qualitative content analysis, but multiple iterations are required to determine the reliability of each stage of analysis.
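
ILLUSTRATION: The two metrics reported in the abstract, precision against human coding and κ for intercoder agreement, can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not say which κ variant was used, so Cohen's κ for two coders is assumed here, and the function names and category labels are hypothetical.

```python
from collections import Counter


def precision(extracted, human_confirmed):
    """Share of extracted items that the human coding also accepted."""
    extracted = set(extracted)
    if not extracted:
        raise ValueError("no extracted items to evaluate")
    return len(extracted & set(human_confirmed)) / len(extracted)


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who each assign one category per item.

    Assumption: the study may have used a different kappa variant.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of items")
    n = len(codes_a)
    # Observed agreement: fraction of items given the same category.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: product of each coder's marginal category frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical category labels for four forum posts, coded in two conversations.
run_1 = ["planning", "planning", "substitution", "substitution"]
run_2 = ["planning", "planning", "substitution", "planning"]
print(cohen_kappa(run_1, run_2))  # 0.5
```

In this toy example the two runs agree on 3 of 4 posts (observed agreement 0.75) while chance agreement is 0.5, which gives κ = 0.5. With more than two coding runs per scheme, as in the study design, a multi-rater statistic or averaged pairwise κ would be needed instead.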