%0 Journal Article
%T Guideline appraisal with AGREE II: Systematic review of the current evidence on how users handle the 2 overall assessments.
%A Hoffmann-Eßer W
%A Siering U
%A Neugebauer EA
%A Brockhaus AC
%A Lampert U
%A Eikermann M
%J PLoS One
%V 12
%N 3
%D 2017
%M 28358870
%F 3.752
%R 10.1371/journal.pone.0174831
%X BACKGROUND: The Appraisal of Guidelines for Research & Evaluation (AGREE) II instrument is the most commonly used guideline appraisal tool. It includes 23 appraisal criteria (items) organized within 6 domains and 2 overall assessments (1. overall guideline quality; 2. recommendation for use). The aim of this systematic review was twofold. Firstly, to investigate how often AGREE II users conduct the 2 overall assessments. Secondly, to investigate the influence of the 6 domain scores on each of the 2 overall assessments.
METHODS: A systematic bibliographic search was conducted for publications reporting guideline appraisals with AGREE II. The impact of the 6 domain scores on the overall assessment of guideline quality was examined using a multiple linear regression model. Their impact on the recommendation for use (possible answers: "yes", "yes, with modifications", "no") was examined using a multinomial regression model.
RESULTS: 118 relevant publications including 1453 guidelines were identified. 77.1% of the publications reported results for at least one overall assessment, but only 32.2% reported results for both overall assessments. The results of the regression analyses showed a statistically significant influence of all domains on overall guideline quality, with Domain 3 (rigour of development) having the strongest influence. For the recommendation for use, the results showed a significant influence of Domains 3 to 5 ("yes" vs. "no") and Domains 3 and 5 ("yes, with modifications" vs. "no").
CONCLUSIONS: The 2 overall assessments of AGREE II are underreported by guideline assessors. Domains 3 and 5 have the strongest influence on the results of the 2 overall assessments, while the other domains have a varying influence. Within a normative approach, our findings could be used as guidance for weighting individual domains in AGREE II to make the overall assessments more objective. Alternatively, a stronger content analysis of the individual domains could clarify their importance in terms of guideline quality. Moreover, AGREE II should require users to transparently present how they conducted the assessments.