Determining an Appropriate Sample Size for Qualitative Interviews to Achieve True and Near Code Saturation: Secondary Analysis of Data

Abstract:

BACKGROUND: In-depth interviews are a common method of qualitative data collection, providing rich data on individuals' perceptions and behaviors that would be challenging to collect with quantitative methods. Researchers typically need to decide on sample size a priori. Although studies have assessed when saturation has been achieved, there is no agreement on the minimum number of interviews needed to achieve saturation. To date, most research on saturation has been based on in-person data collection. During the COVID-19 pandemic, web-based data collection became increasingly common, as traditional in-person data collection was often not possible. Researchers have continued to use web-based data collection methods since the COVID-19 emergency, making it important to assess whether findings around saturation differ for in-person versus web-based interviews.
OBJECTIVE: We aimed to identify the number of web-based interviews needed to achieve true code saturation or near code saturation.
METHODS: The analyses for this study were based on data from 5 Food and Drug Administration-funded studies conducted through web-based platforms with patients with underlying medical conditions or with health care providers who provide primary or specialty care to patients. We extracted code- and interview-specific data and examined the data summaries to determine when true saturation or near saturation was reached.
RESULTS: The sample size used in the 5 studies ranged from 30 to 70 interviews. True saturation was reached after 91% to 100% (n=30-67) of planned interviews, whereas near saturation was reached after 33% to 60% (n=15-23) of planned interviews. Studies that relied heavily on deductive coding and studies that had a more structured interview guide reached both true saturation and near saturation sooner. We also examined the types of codes applied after near saturation had been reached. In 4 of the 5 studies, most of these codes represented previously established core concepts or themes. Codes representing newly identified concepts, other or miscellaneous responses (eg, "in general"), uncertainty or confusion (eg, "don't know"), or categorization for analysis (eg, correct as compared with incorrect) were less commonly applied after near saturation had been reached.
CONCLUSIONS: This study suggests that near saturation may be a sufficient target and that conducting additional interviews after that point may yield diminishing returns. Factors to consider in determining how many interviews to conduct include the structure and type of questions included in the interview guide, the coding structure, and the population under study. Studies with less structured interview guides, studies that rely heavily on inductive coding and analytic techniques, and studies that include populations that may be less knowledgeable about the topics discussed may require a larger sample size to reach an acceptable level of saturation. Our findings also build on previous studies looking at saturation for in-person data collection conducted at a small number of sites.
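
The abstract does not give the operational definitions used to locate true and near saturation, so the following Python sketch is only an illustration of how such points might be found from code- and interview-specific data. It assumes that true saturation is the interview at which every code in the final set has been applied at least once, and that near saturation is the interview at which a chosen share of those codes has been applied. The function name `saturation_points`, the `near_threshold` parameter, and the example data are all hypothetical and are not taken from the study.

```python
from math import ceil


def saturation_points(interviews, near_threshold=0.90):
    """Return (near_n, true_n) as 1-based interview numbers.

    interviews: list of sets, the codes applied in each interview, in
        the order the interviews were conducted.
    near_threshold: ASSUMED share of all codes that counts as "near"
        saturation; the abstract does not state the threshold used.
    """
    # All codes that were ever applied across the whole study.
    all_codes = set().union(*interviews)
    needed_near = ceil(near_threshold * len(all_codes))

    seen = set()
    near_n = true_n = None
    for n, codes in enumerate(interviews, start=1):
        seen |= codes  # accumulate codes observed so far
        if near_n is None and len(seen) >= needed_near:
            near_n = n
        if len(seen) == len(all_codes):
            true_n = n  # every code has now appeared at least once
            break
    return near_n, true_n


# Made-up data: 5 interviews, 4 distinct codes.
example = [{"A", "B"}, {"B", "C"}, {"A"}, {"D"}, {"A", "C"}]
result = saturation_points(example, near_threshold=0.75)
print(result)  # (2, 4)
```

In this toy example, three of the four codes have appeared by interview 2, while the last new code does not appear until interview 4. The gap between the two points is the kind of pattern the results describe, although the real analysis may have defined the thresholds differently.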
