Keywords: Bayesian estimation; diagnostic classification model; forced-choice format; pairwise comparison

Source: DOI:10.1177/00131644211069906

Abstract:
Forced-choice (FC) item formats for noncognitive tests typically present a set of response options that measure different traits and instruct respondents to judge among these options according to their preference, in order to control the response biases commonly observed in normative tests. Diagnostic classification models (DCMs) can provide information regarding the mastery status of test takers on latent discrete variables and are more commonly used for cognitive tests in educational settings than for noncognitive tests. The purpose of this study is to develop a new class of DCM for FC items under the higher-order DCM framework, to meet the practical demand of simultaneously controlling for response biases and providing diagnostic classification information. Through a series of simulations in which the model parameters are calibrated with Bayesian estimation, the study shows that, in general, the model parameters can be recovered satisfactorily when tests are long and samples are large. More attributes improve the precision of the second-order latent trait estimation in a long test, but decrease the classification accuracy and the estimation quality of the structural parameters. When statements are allowed to load on two distinct attributes in paired comparison items, the specific-attribute condition produces better parameter estimation than the overlap-attribute condition. Finally, an empirical analysis of work-motivation measures is presented to demonstrate the applications and implications of the new model.