BACKGROUND: A novel type of item set, the "f-type" testlet, was recently introduced on the United States Medical Licensing Examination. These testlets contain two or more questions associated with a common clinical scenario. In some cases, as the scenario unfolds, examinees are indirectly provided with feedback about their response to a testlet question. This study investigates the effects of this format and of providing examinees with feedback about their performance.
METHODS: Examinee behavior is predicted using an item response model, and observed examinee responses are compared with model expectations for f-type testlets. Mean model-data discrepancies among specific examinee groups are compared to study the dependencies across within-testlet items (i.e., case-specificity) and the impact of providing feedback.
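The comparison described in METHODS can be sketched as follows. This is only an illustrative sketch: the abstract does not state which item response model was used, so a two-parameter logistic (2PL) form is assumed here, and the function names, the record layout, and the group labels are all hypothetical.

```python
import math
from collections import defaultdict


def p_correct(theta, a, b):
    """Expected probability of a correct response under an ASSUMED 2PL model.

    theta: examinee proficiency; a: item discrimination; b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def mean_discrepancy_by_group(records):
    """Mean (observed - expected) response per examinee group.

    records: iterable of (group, theta, a, b, observed), with observed in {0, 1}.
    A positive mean indicates examinees in that group did better on the item
    than the model predicts; a negative mean indicates they did worse.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, theta, a, b, observed in records:
        sums[group] += observed - p_correct(theta, a, b)
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in sums}


# Hypothetical records for a follow-up item, grouped by the outcome on the
# initial testlet item and by whether feedback was received.
example = [
    ("initial_correct", 0.5, 1.0, 0.0, 1),
    ("initial_correct", -0.2, 1.0, 0.0, 1),
    ("initial_incorrect_no_feedback", 0.3, 1.0, 0.0, 0),
    ("initial_incorrect_feedback", 0.1, 1.0, 0.0, 1),
]
discrepancies = mean_discrepancy_by_group(example)
```

Under this sketch, a group mean near zero would be consistent with no dependency between the within-testlet items beyond what proficiency and item characteristics already explain.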
RESULTS: Findings showed that case-specificity effects were present (on average) for all examinee subgroups except examinees who both responded unsuccessfully to the initial item within an f-type testlet and received feedback. Case-specificity effects were negative for examinees who responded unsuccessfully to the initial testlet item but did not receive feedback. For those who responded successfully to the initial testlet items, case-specificity effects were positive.
CONCLUSIONS: Results suggest that responses to test questions within an f-type testlet are not independent, even after accounting for examinee proficiency and item characteristics. Case-specificity effects (i.e., dependencies) were observed on average for all examinees except those who both responded unsuccessfully to the initial item within an f-type testlet and received feedback. Research into modeling these effects through the use of more general item response models is recommended.