MTMM experiments

  • Article type: Journal Article
    Although most survey researchers agree that reliability is a critical requirement for survey data, there have not been many efforts to assess the reliability of responses in national surveys. In addition, there are quite different approaches to studying the reliability of survey responses. In the first section of the Lecture, I contrast a psychological theory of over-time consistency with three statistical models that use reinterview data, multi-trait multi-method experiments, and three-wave panel data to estimate reliability. The more sophisticated statistical models reflect concerns about memory effects and the impact of method factors in reinterview studies. In the following section of the Lecture, I examine some of the major findings from the literature on reliability. Despite the differences across methods for exploring reliability, the findings mostly converge, identifying similar respondent and question characteristics as major determinants of reliability. The next section of the paper looks at the correlations among estimates of reliability derived from the different methods; it finds some support for the validity of the measures from traditional reinterview studies. The empirical claims motivating the more sophisticated methods for estimating reliability are not strongly supported in the literature. Reliability is, in my judgment, a neglected topic among survey researchers, and I hope the Lecture spurs further studies of the reliability of survey questions.
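    As a concrete illustration (not drawn from the Lecture itself) of how three-wave panel data can separate reliability from true change, the classic quasi-simplex estimator associated with Heise assumes lag-1 change and equal reliability at every wave; under those assumptions the reliability of the measure can be written as:

```latex
% Heise-type quasi-simplex reliability estimate from three-wave panel data.
% Assumes lag-1 (simplex) change and equal reliability at all three waves;
% r_{jk} denotes the observed correlation between the measures at waves j and k.
\hat{\rho}_{xx} = \frac{r_{12}\, r_{23}}{r_{13}}
```

    A simple test-retest correlation such as r12, by contrast, confounds unreliability with genuine change between the two interviews, which is one motivation for the more elaborate models the abstract describes.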

  • Article type: Journal Article
    The usual method for assessing the reliability of survey data has been to conduct reinterviews at a short interval (such as one to two weeks) after an initial interview and to use these data to estimate relatively simple statistics, such as gross difference rates (GDRs). More sophisticated approaches have also been used to estimate reliability. These include estimates from multi-trait multi-method experiments, models applied to longitudinal data, and latent class analyses. To our knowledge, no prior study has systematically compared these different methods for assessing reliability. The Population Assessment of Tobacco and Health Reliability and Validity (PATH-RV) Study, done on a national probability sample, assessed the reliability of answers to the Wave 4 questionnaire from the PATH Study. Respondents in the PATH-RV were interviewed twice about two weeks apart. We examined whether the classic survey approach yielded different conclusions from the more sophisticated methods. We also examined two ex ante methods for assessing problems with survey questions, as well as item nonresponse rates and response times, to see how strongly these related to the different reliability estimates. We found that kappa was highly correlated with both GDRs and over-time correlations, but the latter two statistics were less highly correlated, particularly for adult respondents; estimates from longitudinal analyses of the same items in the main PATH study were also highly correlated with the traditional reliability estimates. The latent class analysis results, based on fewer items, also showed a high level of agreement with the traditional measures. The other methods and indicators had at best weak relationships with the reliability estimates derived from the reinterview data. Although the Question Understanding Aid seems to tap a different factor from the other measures, for adult respondents, it did predict item nonresponse and response latencies and thus may be a useful adjunct to the traditional measures.
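    The reinterview statistics named above are simple enough to sketch directly. The following minimal Python example is illustrative only (it is not code from the PATH-RV Study): it computes a gross difference rate, Cohen's kappa, and an over-time correlation for one hypothetical categorical item asked in an interview and again in a reinterview about two weeks later.

```python
# Minimal sketch of the simple reinterview statistics named in the abstract:
# gross difference rate (GDR), Cohen's kappa, and the over-time correlation.
# The paired responses below are hypothetical, not PATH-RV data.
import numpy as np

def gross_difference_rate(t1, t2):
    """Share of respondents whose answer changed between interview and reinterview."""
    t1, t2 = np.asarray(t1), np.asarray(t2)
    return np.mean(t1 != t2)

def cohens_kappa(t1, t2):
    """Chance-corrected agreement: kappa = (p_o - p_e) / (1 - p_e)."""
    t1, t2 = np.asarray(t1), np.asarray(t2)
    cats = np.union1d(t1, t2)
    p_o = np.mean(t1 == t2)                        # observed agreement
    p1 = np.array([np.mean(t1 == c) for c in cats])
    p2 = np.array([np.mean(t2 == c) for c in cats])
    p_e = np.sum(p1 * p2)                          # agreement expected by chance
    return (p_o - p_e) / (1 - p_e)

# Hypothetical paired responses (1 = yes, 0 = no) from the two administrations.
wave1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
wave2 = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]
print("GDR:", gross_difference_rate(wave1, wave2))
print("kappa:", round(cohens_kappa(wave1, wave2), 3))
print("over-time r:", round(np.corrcoef(wave1, wave2)[0, 1], 3))
```

    Chance-corrected agreement (kappa) and raw disagreement (the GDR) can diverge when the marginal distribution of answers is skewed, which is one reason reinterview studies often report both alongside the over-time correlation.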
