Background: Guidelines of high methodological quality make an essential contribution to the quality assurance of medical knowledge. The detailed evaluation of guideline quality is a complex and time-consuming task. The answers to a few key questions generally suffice for an initial, rapid assessment of the quality and utility of a guideline.
Method: We selectively searched the pertinent literature for guideline-assessing instruments and analyzed selected ones with respect to their target group, purpose, orientation, and comprehensiveness. We identified key questions from brief instruments that can be used to assess guideline quality rapidly.
Results: A comparison of ten instruments revealed that most were designed to provide a highly detailed assessment of guideline quality. Four recently developed instruments enable a rough and rapid assessment. They focus, in essence, on four key questions: Was the evidence analyzed systematically? Does the evidence support the recommendations? Is the goal of the guideline formulated, and are the authors named? Is the organization of the guideline easy to follow, and are the recommendations clearly signposted?
Conclusion: Alongside the comprehensive instruments for assessing guidelines, such as DELBI and AGREE II, rapid-assessment instruments are a convenient tool for gaining a quick impression of the value of a guideline.
Guidelines, as defined by the Institute of Medicine (IOM), are aids to decision-making that have been systematically developed for the management of specific problems in medicine. Their purpose is to improve patient care (1, 2). The Association of Scientific Medical Societies in Germany (Arbeitsgemeinschaft der Wissenschaftlichen Medizinischen Fachgesellschaften, AWMF) classifies guidelines into three categories: S1 expert recommendations contain treatment recommendations developed by informal consensus of a group of experts, S2 guidelines are created by formal consensus-finding and/or a formal search for evidence, and S3 guidelines contain all of the elements of systematic development (3). Guidelines are now meeting with increasing acceptance among German physicians and are thus making a major contribution to the transfer of scientific knowledge into clinical practice (4, 5).
As the number of guidelines increases, so, too, does the amount of information that physicians must contend with. The length of the guidelines and the numerous, detailed recommendations found in them are not the only challenge. The broader the discipline, the more extensive the information that the relevant guidelines will contain, and that the individual specialist will have to master. Moreover, conflicting recommendations in the guidelines of overlapping disciplines commonly pose an additional barrier to guideline implementation (6).
Guidelines can only be successful if they are generated by proper methods and if they are of sufficiently high quality in matters of specialized expertise and content. According to the Appraisal of Guidelines for Research and Evaluation (AGREE) Collaboration, the key features of a good guideline are external validity, internal validity, and applicability in practice (7).
Internal validity means that any biases potentially affecting the recommendations were minimized in the guideline-developing process. This pertains, for example, to the mode of data acquisition and the way the data are presented (formulation of key questions and target variables; literature search and selection), the assessment of the evidence, and the use of the evidence to generate recommendations.
External validity means that the guideline actually leads to the intended improvement in patient care.
The applicability of guideline recommendations in practice is affected by further factors, including pilot testing, pre-implementation review, clarity rather than ambiguity in the formulation of recommendations, and the format of the guideline itself. Many manuals and instruments can be used for help in generating guidelines and assessing their quality (8–13). Even though it was recently shown that the quality of guidelines in various countries has not appreciably improved over time (14, 15), active clinicians still need reliable guidelines and intelligible short versions (16).
The goal of this article is to compare the main currently available instruments for assessing the quality of guidelines with respect to their comprehensiveness, orientation, and relevance to clinical practice. Rapid-assessment instruments such as the mini-checklist (MiChe) published by the authors (18) will be presented here to illustrate what practicing clinicians should look for when assessing guidelines, and what the key criteria are that should be used to tell good and bad guidelines apart without difficulty. Readers armed with this knowledge should be able to assess the utility and quality of a guideline rapidly.
In January 2010, as an initial step in the development of the rapid-assessment instrument MiChe, a systematic search was performed in guideline directories and bibliographic databases for instruments to be used in the creation and assessment of guidelines. Assessment criteria for MiChe were selected on the basis of those most commonly found in the retrieved instruments, as well as by questioning of guideline experts (18).
To identify current assessment instruments, relevant publications that appeared from January 2010 to February 2015 were selectively sought both in PubMed (with the “related citations” function, on the basis of publications retrieved by the initial search) and on the Internet pages of guideline-creating bodies and the AGREE Working Group. All of the retrieved references were examined by a reviewer (WB or TS). All articles that concerned new instruments for methodological assessment and that were written in either English or German were used for this review.
To characterize the assessment instruments, we extracted and compared information on their potential users, their purpose, and their orientation and extent. The main criteria to be considered in the rapid assessment of a guideline were determined by an analysis of rapid-assessment instruments, with a comparison of the key questions contained in each.
Instruments for assessing guidelines
A literature search in January 2010 yielded 38 potentially relevant publications, 16 of which concerned instruments for assessing the methodological quality of guidelines (7, 19–25, e1–e8). An update of the search in February 2015 yielded a further 12 potentially relevant publications; among these, four were selected for consideration in this review—an updated version of the already existing AGREE II assessment instrument (9, 26, 27) and three new instruments (28–30) (Figure).
An assessment instrument was selected for further consideration when it was stated in the relevant publication that the instrument was intended for broad, public application, and that it could be used to assess guidelines systematically. Instruments that had been developed by individual research groups for their own use (e1, e2, e7) or that contained no assessment system or scale (e3–e6, e8) were excluded, as were two earlier versions of the assessment instruments AGREE II (9, 26, 27) and DELBI (19) (AGREE and Helou et al., respectively).
The main features of the ten remaining instruments and of the MiChe are shown in Table 1. Most are intended to provide a highly detailed assessment of guideline quality, as evidenced by the multiplicity of assessment criteria and the use of multilevel assessment scales. These instruments are most suitable for use by guideline developers, decision-makers in health care policy, and organizations in the health sector. In recent years, however, a number of simple instruments have been independently developed for rapid assessment by physicians, with the purpose of providing a quick and informative overview. Aside from the MiChe (18), these include the iCAHE Guideline Quality Checklist (30), the Global Rating Scale (GRS) of the AGREE Collaboration (28), and the surgeons’ checklist by Coroneos et al. (29).
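The figure of ten instruments can be cross-checked against the selection steps reported above. The short tally below is only an illustrative sketch: all counts are taken from the text, while the variable names are ours and carry no further meaning.

```python
# Tally of the instrument selection described in the text.
# All counts come from the article; the variable names are illustrative.

initial_instruments = 16        # instruments found by the January 2010 search
excluded_own_use = 3            # developed by research groups for their own use (e1, e2, e7)
excluded_no_scale = 5           # no assessment system or scale (e3-e6, e8)
excluded_earlier_versions = 2   # earlier versions of AGREE II and DELBI
added_in_update = 4             # February 2015 update: AGREE II plus three new instruments

# Instruments from the initial search that survive all three exclusions.
retained_initial = (
    initial_instruments
    - excluded_own_use
    - excluded_no_scale
    - excluded_earlier_versions
)

# Instruments compared in Table 1 (MiChe is listed in addition to these).
total_instruments = retained_initial + added_in_update

print(retained_initial)   # 6
print(total_instruments)  # 10
```

Six instruments from the initial search plus four from the update give the ten instruments compared in Table 1, with the MiChe listed alongside them.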
Key criteria in rapid-assessment instruments
Instruments for rapid assessment were developed because it was found that guideline assessment with the more comprehensive instruments, such as AGREE II, was often too time-consuming for clinical practice. The authors of the rapid-assessment instruments stated that they intended them to be used directly by physicians in hospitals or in private practice, where little time is available and staff face high demands. They are supposed to be complementary to the complex instruments, not a replacement for them.
For this purpose, the authors of these instruments sought to formulate a few broad key questions to provide an overall picture of the methodological quality and applicability of guidelines.
The questions and criteria in each of the rapid-assessment instruments are presented for comparison in Table 2, displayed according to their varying methodological properties (quality of guideline creation, quality of report, quality of presentation, quality of underlying evidence) and applicability. A number of criteria were found to be common to most of the instruments. From these, we derived four key questions that physicians should ask when they want to assess guidelines rapidly. The answers can be used to tell good guidelines apart from bad ones: (1) Was the evidence analyzed systematically? (2) Does the evidence support the recommendations? (3) Is the goal of the guideline formulated, and are the authors named? (4) Is the organization of the guideline easy to follow, and are the recommendations clearly signposted?
A more detailed presentation of the eight key criteria of the MiChe is found in the eTable.
Acquiring key information is already a major challenge for practicing physicians and is only becoming more difficult over time. Physicians must be competent to deal with an enormous variety of conditions, and the time available is usually short: as a rule, patient contacts in the ambulatory setting are brief, and hospital stays are kept as short as possible. The time available for continuing medical education is limited as well. Physicians, therefore, often make decisions based on their own and their colleagues’ personal experience, and on information sources other than clinical research reports (16, 31, 32). Despite this, physicians are now showing increasing interest in guidelines. The acceptance of guidelines seems to depend mainly on the availability of a short version and on the physician’s ability to assess the quality of guideline creation (16). To meet this growing demand, handy instruments are needed to help physicians evaluate guidelines rapidly on their own before incorporating them in their daily practice.
The literature search for the present article retrieved a number of guideline-assessing instruments that are intended for broad use. These were mainly instruments that provided a highly detailed evaluation of guideline quality and that were apparently most suitable for use by guideline developers themselves and by specialists in guideline-development methods, in view of their comprehensiveness and their specified modes of application (e.g., by at least two raters). There were also three English-language rapid-assessment instruments designed for everyday use by physicians in clinical practice (28–30), containing up to 14 questions or criteria and an assessment scale of 7 or fewer levels. MiChe (18), an instrument of this type in German, facilitates access for German-speakers (33) and encapsulates the evaluative process in eight key criteria, with a three-level assessment scale. Through the use of essential key questions, all four of the rapid-assessment instruments let physicians make a quick judgment of the methodological quality of a guideline, and often also of its practical applicability. A further instrument called AGREE-REX (Recommendation EXcellence), a counterpart to the established AGREE II instrument for guideline creation, is expected to become available in 2016; it is intended to enable users to make a competent assessment of the scientific and practical quality of guidelines (34).
Each of these rapid-assessment instruments offers a compact version of methodological guideline-development criteria that were previously worked out in detail. The compactly formulated criteria are very similar in all of the instruments; in their simplicity and their focus on essential features, they fit in well with the practicing physician’s need for efficiency.
The user must be able to judge not only the practical aspects—relevance to his or her own patients, naming of the key recommendations, readily intelligible organization of the guideline—but also the methodological quality of guideline creation. The mode of searching for evidence should be both systematic and clearly described, and there should be a clear documentation of the weight that was attached to each finding in the derivation of the recommendations. Further important and readily ascertainable criteria are a complete listing of guideline authors by name and a statement of their conflicts of interest.
By virtue of their focus on a small number of criteria, rapid-assessment instruments can be used not only by individual physicians, but also by groups of physicians (quality circles). Interested physicians can use them for a quick and easy assessment of the usefulness, for their own work in patient care, of guidelines that were developed through a long and methodologically demanding process. Clear and relevant criteria with a simple, goal-directed rating system (“yes,” “partly,” and “no”) facilitate their use. Physicians, patients, and guideline developers all stand to benefit from the consistent implementation of guideline recommendations in practice (16, 35).
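To make the three-level rating concrete, the following sketch shows one way a physician or quality circle might record answers to the four key questions identified in this article. It is an assumption-laden illustration, not part of MiChe or any other published instrument: the data structure, the helper name `summarize`, and the idea of simply counting the answers are ours, and the example answers are invented.

```python
from collections import Counter

# The three rating levels named in the text.
RATINGS = ("yes", "partly", "no")

# The four key questions identified in this article.
KEY_QUESTIONS = (
    "Was the evidence analyzed systematically?",
    "Does the evidence support the recommendations?",
    "Is the goal of the guideline formulated, and are the authors named?",
    "Is the organization of the guideline easy to follow, "
    "and are the recommendations clearly signposted?",
)


def summarize(answers):
    """Count how often each rating was given (hypothetical helper).

    `answers` maps each key question to one of RATINGS.
    No overall score is computed; the article does not define one.
    """
    missing = [q for q in KEY_QUESTIONS if q not in answers]
    if missing:
        raise ValueError(f"unanswered questions: {missing}")
    invalid = [a for a in answers.values() if a not in RATINGS]
    if invalid:
        raise ValueError(f"invalid ratings: {invalid}")
    counts = Counter(answers[q] for q in KEY_QUESTIONS)
    return {rating: counts.get(rating, 0) for rating in RATINGS}


# Invented appraisal of a fictitious guideline, for illustration only.
example = dict(zip(KEY_QUESTIONS, ["yes", "partly", "yes", "no"]))
print(summarize(example))
```

The sketch deliberately stops at a count per rating level; how such answers should be weighed against one another is left to the user and to the published instruments themselves.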
The presentation of evaluative instruments in this article may be incomplete because of the way the instruments were searched for and selected. In the updated search, we looked for additional instruments only with the “related citations” function in a single database, and in the internet portals of a few selected guideline-sponsoring bodies. To be sure that no other important evaluative instruments were missed, we cross-checked the ones retrieved by our search against those cited in three different systematic reviews of the same topic (11, 36, 37): the three reviews did not mention any additional important evaluative instruments developed explicitly for broad, public use. Instruments retrieved by our search that had not been developed explicitly for broad, public use were intentionally excluded, as were the many instruments designed as aids to guideline development.
A specific limitation of the MiChe is that it has not yet been validated. A doctoral dissertation on the topic “The Validity and Reliability of the Mini-Checklist on the Basis of AGREE II” is currently in preparation at the Department of Internal Medicine, Goethe Universität Frankfurt.
The evaluative instruments described here can only be used to assess the methodological quality of a guideline. The instruments make use of information contained in the published guideline itself to test whether certain standards have been met in its development. Guidelines containing adequate information of this type—e.g., about how the data were acquired—will generally be judged to be of high (methodological) quality. In contrast, the instruments generally do not test whether the searching strategy and the databases used were, in fact, well chosen or appropriate to the topic at hand. Thus, although proper attention to methodological standards in the creation of guidelines can be assumed to promote internal validity, some guidelines of high methodological quality may still contain individual recommendations that are not internally valid. Multiple authors have concluded that the available evaluative instruments for guidelines cannot be used to judge their clinical content or the quality of the underlying evidence (7, 11, 38).
Rapid-assessment instruments serve a complementary function to the more complex instruments such as DELBI and AGREE II and are convenient tools with which interested physicians can quickly judge the value of a guideline for themselves. Without depending on any outside help, they can check whether the guideline meets certain essential quality criteria for applicability in patient care. Physicians are now increasingly using guidelines to acquire clinical knowledge; rapid-assessment instruments could contribute additionally to this positive trend and help bring about a sustained improvement in patient care.
The mini-checklist for assessing the methodological quality of guidelines was developed as part of the dissertation of Thomas Semlitsch at the Medizinische Universität Graz, with the financial support of the Zukunftsfond (young researchers’ fund) of the scientific department of the state of Styria (Austria).
Conflict of interest statement
Prof. Kopp participated in the development of DELBI.
The remaining authors declare that no conflict of interest exists.
Manuscript submitted on 7 July 2014, revised version accepted on 13 April 2015.
Translated from the original German by Ethan Taub, M.D.
Mag. rer. nat. Thomas Semlitsch
Institut für Allgemeinmedizin und evidenzbasierte Versorgungsforschung
Medizinische Universität Graz
8036 Graz, Austria
|1.||Field MJ, Lohr KN: Clinical practice guidelines: Directions for a new program. Washington (DC): National Academies Press 1990.|
|2.||Institute of Medicine: Recommended attributes of CPGs. In: Graham R, Mancher M, Miller Wolman D, Greenfield S, Steinberg E (eds.): Clinical practice guidelines we can trust. Washington (DC): The National Academies Press 2011; 18.|
|3.||Arbeitsgemeinschaft der Wissenschaftlichen Medizinischen Fachgesellschaften (AWMF)-Ständige Kommission Leitlinien: AWMF-Regelwerk „Leitlinien“. 1st edition 2012. www.awmf.org/leitlinien/awmf-regelwerk.html (last accessed on 18 May 2015).|
|4.||Hakkennes S, Dodd K: Guideline implementation in allied health professions: a systematic review of the literature. Qual Saf Health Care 2008; 17: 296–300.|
|5.||Medves J, Godfrey C, Turner C, et al.: Systematic review of practice guideline dissemination and implementation strategies for healthcare teams and team-based practice. Int J Evid Based Healthc 2010; 8: 79–89.|
|6.||Behrens T, Keil U, Heidrich J: Barriers to guideline implementation. Dtsch Arztebl Int 2011; 108: 491; author reply 3.|
|7.||The AGREE Collaboration: Development and validation of an international appraisal instrument for assessing the quality of clinical practice guidelines: the AGREE project. Qual Saf Health Care 2003; 12: 18–23.|
|8.||Ansari S, Rashidian A: Guidelines for guidelines: are they up to the task? A comparative assessment of clinical practice guideline development handbooks. PLoS One 2012; 7: e49864.|
|9.||Brouwers MC, Kho ME, Browman GP, et al.: AGREE II: advancing guideline development, reporting and evaluation in health care. CMAJ 2010; 182: E839–42.|
|10.||Burgers J, Grol R, Klazinga N, van der Bij A, Makela M, Zaat J: [International comparison of 19 clinical guideline programs-a survey of the AGREE Collaboration]. Z Arztl Fortbild Qualitatssich 2003; 97: 81–8.|
|11.||Graham ID, Calder LA, Hebert PC, Carter AO, Tetroe JM: A comparison of clinical practice guideline appraisal instruments. Int J Technol Assess 2000; 16: 1024–38.|
|12.||Qaseem A, Forland F, Macbeth F, Ollenschlager G, Phillips S, van der Wees P: Guidelines International Network: toward international standards for clinical practice guidelines. Ann Intern Med 2012; 156: 525–31.|
|13.||Schunemann HJ, Wiercioch W, Etxeandia I, et al.: Guidelines 2.0: systematic development of a comprehensive checklist for a successful guideline enterprise. CMAJ 2014; 186: E123–42.|
|14.||Kryworuchko J, Stacey D, Bai N, Graham ID: Twelve years of clinical practice guideline development, dissemination and evaluation in Canada (1994 to 2005). Implement Sci 2009; 4: 49.|
|15.||Kung J, Miller RR, Mackowiak PA: Failure of clinical practice guidelines to meet institute of medicine standards: Two more decades of little, if any, progress. Arch Intern Med 2012; 172: 1628–33.|
|16.||Vollmar H, Oemler M, Schmiemann G, et al.: Einschätzung von Hausärzten zu Leitlinien, Fortbildung und Delegation. Z Allg Med 2013; 89: 23–30.|
|17.||Turner T, Misso M, Harris C, Green S: Development of evidence-based clinical practice guidelines (CPGs): comparing approaches. Implement Sci 2008; 3: 45.|
|18.||Semlitsch T, Jeitler K, Kopp IB, Siebenhofer A: [Development of a workable mini checklist to assess guideline quality]. Z Evid Fortbild Qual Gesundhwes 2014; 108: 299–312.|
|19.||Ärztliches Zentrum für Qualität in der Medizin (ÄZQ), Arbeitsgemeinschaft der Wissenschaftlichen Medizinischen Fachgesellschaften (AWMF): Deutsches Instrument zur methodischen Leitlinien-Bewertung (DELBI). www.leitlinien.de/mdb/edocs/pdf/literatur/delbi-fassung-2005–2006-domaene-8–2008.pdf (last accessed on 18 May 2015).|
|20.||Cluzeau FA, Littlejohns P, Grimshaw JM, Feder G, Moran SE: Development and application of a generic methodology to assess the quality of clinical guidelines. Int J Qual Health Care 1999; 11: 21–8.|
|21.||Helou A, Ollenschlager G: [Goals, possibilities and limits of quality evaluation of guidelines. A background report on the user manual of the „Methodological Quality of Guidelines“ check list]. Z Arztl Fortbild Qualitatssich 1998; 92: 361–5.|
|22.||Liddle J, Williamson M, Irwig L: Method for evaluating research and guideline evidence. www.health.nsw.gov.au/phb/Documents/1997–1–2.pdf (last accessed on 18 May 2015).|
|23.||Lohr KN, Field MJ: A provisional instrument for assessing clinical practice guidelines. In: Institute of Medicine, Field MJ, Lohr KN (eds.): Guidelines for clinical practice: from development to use. Washington, DC: National Academy Press 1992; 346–410.|
|24.||Reed GM, McLaughlin CJ, Newman R: American Psychological Association policy in context. The development and evaluation of guidelines for professional practice. Am Psychol 2002; 57: 1041–7.|
|25.||Shaneyfelt TM, Mayo-Smith MF, Rothwangl J: Are guidelines following guidelines? The methodological quality of clinical practice guidelines in the peer-reviewed medical literature. JAMA 1999; 281: 1900–5.|
|26.||Brouwers MC, Kho ME, Browman GP, et al.: Development of the AGREE II, part 1: performance, usefulness and areas for improvement. CMAJ 2010; 182: 1045–52.|
|27.||Brouwers MC, Kho ME, Browman GP, et al.: Development of the AGREE II, part 2: assessment of validity of items and tools to support application. CMAJ 2010; 182: E472–8.|
|28.||Brouwers MC, Kho ME, Browman GP, et al.: The Global Rating Scale complements the AGREE II in advancing the quality of practice guidelines. J Clin Epidemiol 2012; 65: 526–34.|
|29.||Coroneos CJ, Voineskos SH, Cornacchi SD, Goldsmith CH, Ignacy TA, Thoma A: Users’ guide to the surgical literature: how to evaluate clinical practice guidelines. Can J Surg 2014; 57: 280–6.|
|30.||Grimmer K, Dizon JM, Milanese S, et al.: Efficient clinical evaluation of guideline quality: development and testing of a new tool. BMC Med Res Methodol 2014; 14: 63.|
|31.||Icsezer S, Linde K: The gap between research and practice – a survey among participants in continuing medical education events. Forschende Komplementärmedizin 2008; 15: 261–7.|
|32.||McAlister FA, Graham I, Karr GW, Laupacis A: Evidence-based medicine and the practicing clinician. J Gen Intern Med 1999; 14: 236–42.|
|33.||Haße W, Fischer R: Englisch in der Medizin: Der Aus- und Weiterbildung hinderlich. Dtsch Arztebl 2001; 98: A-3100–2.|
|34.||The AGREE Collaboration: Innovations to enhance the capacity of practice guidelines to improve health and health care systems: Recommendation EXcellence (AGREE-REX). www.agreetrust.org/agree-research-projects/current-research-projects/agree-rex- recommendation-excellence/ (last accessed on 9 October 2014).|
|35.||Peters-Klimm F, Natanzon I, Muller-Tasch T, et al.: [Barriers to guideline implementation and educational needs of general practitioners regarding heart failure: a qualitative study]. GMS Z med Ausbild 2012; 29: Doc46.|
|36.||Siering U, Eikermann M, Hausner E, Hoffmann-Esser W, Neugebauer EA: Appraisal tools for clinical practice guidelines: a systematic review. PLoS One 2013; 8: e82915.|
|37.||Vlayen J, Aertgeerts B, Hannes K, Sermeus W, Ramakers D: A systematic review of appraisal tools for clinical practice guidelines: Multiple similarities and one common deficit. Int J Qual Health Care 2005; 17: 235–42.|
|38.||Burls A: AGREE II-improving the quality of clinical care. Lancet 2010; 376: 1128–9.|
|e1.||Calder L, Hébert P, Carter A, Graham I: Review of published recommendations and guidelines for the transfusion of allogeneic red blood cells and plasma. CMAJ 1997; 156: 1–8.|
|e2.||Grilli R, Magrini N, Penna A, Mura G, Liberati A: Practice guidelines developed by specialty societies: the need for a critical appraisal. Lancet 2000; 355: 103–6.|
|e3.||Hayward RS, Wilson MC, Tunis SR, Bass EB, Guyatt G: Users’ guides to the medical literature. VIII. How to use clinical practice guidelines. A. Are the recommendations valid? The Evidence-Based Medicine Working Group. JAMA 1995; 274: 570–4.|
|e4.||Hutchinson A, McIntosh A, Anderson J, Gilbert C, Field R: Developing primary care review criteria from evidence-based guidelines: coronary heart disease as a model. Br J Gen Pract 2003; 53: 690–6.|
|e5.||Marshall JK: A critical approach to clinical practice guidelines. Can J Gastroenterol 2000; 14: 505–9.|
|e6.||Selker HP: Criteria for adoption in practice of medical practice guidelines. Am J Cardiol 1993; 71: 339–41.|
|e7.||Ward JE, Grieco V: Why we need guidelines for guidelines: a study of the quality of clinical practice guidelines in Australia. Med J Aust 1996; 165: 574–6.|
|e8.||Woolf SH: Practice guidelines: what the family physician should know. Am Fam Physician 1995; 51: 1455–63.|