Guideline development through GRADE August 28, 2011 GIN 2011, Seoul, Korea.

Guideline development through GRADE August 28, 2011 GIN 2011, Seoul, Korea

Holger Schünemann, MD, PhD Professor and Chair, Dept. of Clinical Epidemiology & Biostatistics Professor of Medicine Michael Gent Chair in Healthcare Research McMaster University, Hamilton, Canada Nicola Magrini, MD, Clinical Pharmacologist NHS CeVEAS NHS Centre for the Evaluation of the Effectiveness of Health Care WHO Collaborating Centre for evidence-based research synthesis and guideline development Modena, Italy Wiley Chan, MD Physician, Internal Medicine Methodologist, Kaiser Permanente National Guideline Program Kaiser Permanente Portland, Oregon, USA

Agenda 09.00 h — 09.15 h Introduction to the workshop 09.15 h — 09.45 h Overview of guideline development 09.45 h — 10.45 h Large group session “The GRADE approach: Introduction” 10.45 h — 11.00 h Break 11.00 h — 12.00 h Small group “Asking a question, specifying outcomes” 12.00 h — 13.00 h Large group session “The GRADE approach: assessing the quality of evidence” 13.00 h — 13.45 h Break 13.45 h — 15.15 h Small group work “Grading quality of evidence” 15.15 h — 15.45 h Large group session “Moving from evidence to recommendations: theoretical considerations” 15.45 h — 16.00 h Break 16.00 h — 16.30 h Small group work “Making recommendations” 16.30 h — 17.00 h Large group session “Report from small groups and practical considerations of group processes” 17.00 h — 17.30 h Large group session “Various topics (e.g. conflict of interest handling) and feedback”

History - 1967 – Founded by David Sackett - 6 chairs since - Instrumental in specialty of Clinical Epidemiology, origin of “Evidence-Based Medicine” People 45 full time and joint faculty ~ 120 associate & part time faculty; 19 emeritus ~ 180 staff ~ 200 PhD and Master students The Department of Clinical Epidemiology & Biostatistics at McMaster

What is a guideline? "Guidelines are recommendations intended to assist providers and recipients of health care and other stakeholders to make informed decisions. Recommendations may relate to clinical interventions, public health activities, or government policies." WHO 2003, 2007

Guideline development Process

Working with evidence For key recommendations: – Search for and retrieve all available evidence – Identify relevant SRs – Formally assess quality of evidence – GRADE (systematic and transparent approach)

Scoping - What is the problem? Knowledge gap? – Is a guideline the right approach? Diagnosis? – Too many cases? Too few? Variation? Treatment? – Under? Over? Variation? Something new? Screening? Quality of care? Integration of care? Other?

What evidence do you have? Utilisation data? Costs? Health outcomes? Complaints? Requests? Other?

Criteria for topic and scope (Grol) The topic concerns a relevant problem that occurs frequently and guideline development allows improvement in health or cost reduction It is possible to define the topic and focus on the most crucial aspects There is uncertainty or difference of opinion about the best care There is a need to bring together scientific knowledge and expertise or there are new insights Sufficient scientific evidence is available* There is a real opportunity to achieve consensus on the final recommendations It is possible to formulate feasible recommendations

The scope Small is beautiful (S. Hill) Who is the target user of the guideline Who it applies to What is covered? – Eg diagnosis and treatment of diabetic retinopathy Develop key questions (<20…..)

What healthcare workers want… A guideline is not a textbook or a cookbook To KNOW that the guideline is evidence based But not necessarily all of the evidence… To have it easy to use and accessible Clear recommendations (more on that later)

Group composition One systematic review (Murphy et al. 1998) Composition of panel influences recommendations – Members of a specialty are more likely to advocate techniques that involve their specialty Balanced groups – Select the appropriate group leader Necessary technical skills – including information retrieval, systematic reviewing, health economics, group facilitation, project management, writing and editing Include or have access to content experts No SR on how to obtain consultation, but logical reasons support this Up to 15 members

Group composition „Include all who are affected“ -To identify the right questions -To identify areas of suboptimal care -To identify feasibility of recommendations Consequences -Definition of Standards of Care -Ownership to improve implementation

Expertise needed in the group Medical content: health care professionals Values and preferences: patients / carers / community Support staff: ‚technical‘ professionals, e.g. epidemiologists, health economists, administrative support

Which approach? Evidence Recommendation B Class I A 1 IV C Organization  AHA  ACCP  SIGN Recommendation for use of oral anticoagulation in patients with atrial fibrillation and rheumatic mitral valve disease

What to do?

GRADE Working Group G rades of R ecommendation A ssessment, D evelopment and E valuation CMAJ 2003, BMJ 2004, BMC 2004, BMC 2005, AJRCCM 2006, Chest 2006, BMJ 2008 Aim: to develop a common, transparent and sensible system for grading the quality of evidence and the strength of recommendations - Since 2000 - Guideline developers, methodologists & clinicians from around the world

Systematic review Guideline development PICOPICO Outcome Formulate question Rate importance Critical Important Critical Not important Create evidence profile with GRADEpro Summary of findings & estimate of effect for each outcome Grade overall quality of evidence across outcomes based on lowest quality of critical outcomes Randomization increases initial quality 1.Risk of bias 2.Inconsistency 3.Indirectness 4.Imprecision 5.Publication bias Grade down Grade up 1.Large effect 2.Dose response 3.Confounders Rate quality of evidence for each outcome Select outcomes Very low Low Moderate High Formulate recommendations: For or against (direction) Strong or conditional/weak (strength) By considering:  Quality of evidence  Balance benefits/harms  Values and preferences Revise if necessary by considering:  Resource use (cost) “We recommend using…” “We suggest using…” “We recommend against using…” “We suggest against using…” Outcomes across studies

GRADE Working Group G rades of R ecommendation A ssessment, D evelopment and E valuation CMAJ 2003, BMJ 2004, BMC 2004, BMC 2005, AJRCCM 2006, Chest 2006, BMJ 2008 International group: ACCP, AHRQ, Australian NMRC, BMJ Clinical Evidence, CC, CDC, McMaster Uni., NICE, Oxford CEBM, SIGN, UpToDate, USPSTF, WHO Aim: to develop a common, transparent and sensible system for grading the quality of evidence and the strength of recommendations (over 100 systems) International group of guideline developers, methodologists & clinicians from around the world (>200 contributors) – since 2000

GRADE Uptake  World Health Organization  Allergic Rhinitis in Asthma Guidelines (ARIA)  American Thoracic Society  American College of Physicians  European Respiratory Society  European Society of Thoracic Surgeons  British Medical Journal  Infectious Disease Society of America  American College of Chest Physicians  UpToDate®  National Institutes of Health and Clinical Excellence (NICE)  Scottish Intercollegiate Guideline Network (SIGN)  Cochrane Collaboration  Infectious Disease Society of America  Clinical Evidence  Agency for Health Care Research and Quality (AHRQ)  Partner of GIN  Over 60 (major) organizations

Evidence based healthcare decisions Research evidence Population/societal values and preferences (Clinical) state and circumstances Expertise Haynes et al. 2002

Confidence in evidence There always is evidence – “When there is a question there is evidence” Better research  greater confidence in the evidence and decisions

Hierarchy of evidence based on quality STUDY DESIGN Randomized Controlled Trials Cohort Studies and Case Control Studies Case Reports and Case Series, Non-systematic observations Expert Opinion BIAS

Explain the following? Confounding, effect modification & ext. validity Concealment of randomization Blinding (who is blinded in a double blinded study?) Intention to treat analysis and its correct application P-values and confidence intervals “Everything should be made as simple as possible but not simpler.”

BMJ 2003 BMJ, 2003

BMJ 2003 Relative risk reduction: ….> 99.9 % (1/100,000) U.S. Parachute Association reported 821 injuries and 18 deaths out of 2.2 million jumps in 2007

Simple hierarchies are (too) simplistic STUDY DESIGN Randomized Controlled Trials Cohort Studies and Case Control Studies Case Reports and Case Series, Non-systematic observations BIAS Expert Opinion Schünemann & Bone, 2003

Why bother about grading? People always draw conclusions about: – Quality of evidence – Strength of recommendation Systematic and explicit approaches can help: – Protect against errors – Resolve disagreements – Facilitate critical appraisal – Communicate information

Getting from evidence to recommendations - GRADE Recommendations are judgments: – Quality of evidence – Trade off between benefits and harms – Values and preferences – Resource use But judgments need to be based on the best available evidence and transparent

Desirable outcomes – lower mortality – reduced hospital stay – reduced duration of disease – reduced resource expenditure Undesirable outcomes – adverse reactions – the development of resistance – costs of treatment Every decision comes with desirable and undesirable consequences  Developing recommendations must include a consideration of desirable and undesirable outcomes Choosing outcomes

GRADE: recommendation – quality of evidence Clear separation: 1) 4 categories of quality of evidence:  (High),   (Moderate),   (Low),   (Very low) ? – methodological quality of evidence – likelihood of bias – by outcome and across outcomes 2) Recommendation: 2 grades – conditional (aka weak) or strong (for or against an intervention)? – Balance of benefits and downsides, values and preferences, resource use and quality of evidence *www.GradeWorking-Group.org

GRADE Quality of Evidence In the context of making recommendations: The quality of evidence reflects the extent of our confidence that the estimates of an effect are adequate to support a particular decision or recommendation.

Likelihood of and confidence in an outcome

Determinants of quality RCTs  observational studies   5 factors that can lower quality 1.limitations in detailed design and execution (risk of bias criteria) 2.Inconsistency (or heterogeneity) 3.Indirectness (PICO and applicability) 4.Imprecision (number of events and confidence intervals) 5.Publication bias 3 factors can increase quality 1.large magnitude of effect 2.all plausible residual confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed 3.dose-response gradient

GRADE evidence profile 39

Strength of recommendation “The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.” Strong or conditional

Implications of a strong recommendation Patients: Most people in this situation would want the recommended course of action and only a small proportion would not Clinicians: Most patients should receive the recommended course of action Policy makers: The recommendation can be adapted as a policy in most situations

Implications of a conditional/weak recommendation Patients: The majority of people in this situation would want the recommended course of action, but many would not Clinicians: Be more prepared to help patients to make a decision that is consistent with their own values/decision aids and shared decision making Policy makers: There is a need for substantial debate and involvement of stakeholders

Determinants of the strength of recommendation

Asking questions and choosing outcomes Holger

Outline Type of questions Framing a foreground question Choosing outcomes Relative importance of outcomes 49

Guidelines and questions Guidelines are a way of answering questions about clinical, communication, organisational or policy interventions, in the hope of improving health care or health policy. It is therefore helpful to structure a guideline in terms of answerable questions. WHO Guideline Handbook, 2008

Types of questions Background Questions Definition: What is COPD? Mechanism: What is the mechanism of action of mucolytic therapy? Foreground Questions Efficacy : In patients with COPD, does mucolytic therapy improve survival?

Framing a foreground question PICOPICO

Population: Intervention: Comparison: Outcomes:

Case scenario A 13 year old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chicken stock shares this room too and several chicken had died unexpectedly a few days before the girl fell sick. Potential interventions: antivirals, such as neuraminidase inhibitors oseltamivir and zanamivir

What are examples of: Background questions Foreground questions Population: Intervention: Comparison: Outcomes: 55

Framing a foreground question Population: Avian Flu/influenza A (H5N1) patients Intervention: Oseltamivir (or Zanamivir) Comparison: No pharmacological intervention Outcomes: Mortality, hospitalizations, resource use, adverse outcomes, antimicrobial resistance Schunemann, Hill et al., The Lancet ID, 2007

Choosing outcomes Every decision comes with desirable and undesirable consequences  Developing recommendations must include a consideration of desirable and undesirable outcomes  Outcomes should be patient important outcomes.

desirable outcomes – lower mortality – reduced hospital stay – reduced duration of disease – reduced resource expenditure undesirable outcomes – adverse reactions – the development of resistance – costs of treatment Choosing outcomes

 What if what is important is not measured?  What if what is measured is not important?  How do we make sure we’ve covered all important outcomes? Choosing outcomes

Decision makers (and guideline authors) need to consider the relative importance of outcomes when balancing these outcomes to make a recommendation Relative importance vary across populations Relative importance may vary across patient groups within the same population When considered critical - evaluate Relative importance of outcomes

2 Critical for decision making Important, but not critical for decision making Of low importance 5 6 7 8 9 3 4 1 Relative importance of outcomes

Flatulence 2 Hierarchy of outcomes according to their importance to assess the effect of phosphate lowering drugs in patients with renal failure and hyperphosphatemia Importance of endpoints Critical for decision making Important, but not critical for decision making Of low importance 5 Pain due to soft tissue 6 calcification / function Fractures 7 Myocardial infarction 8 Mortality 9 3 4 1

Hierarchy of evidence based on quality STUDY DESIGN Randomized Controlled Trials Cohort Studies and Case Control Studies Case Reports and Case Series, Non-systematic observations Expert Opinion BIAS

Healthcare problem recommendation “Healthy people” “Herd immunity” “Long term perspective” “Disease perception” “Lots of other things”

Determinants of quality RCTs  observational studies   5 factors that can lower quality 1.limitations in detailed design and execution (risk of bias criteria) 2.Inconsistency (or heterogeneity) 3.Indirectness (PICO and applicability) 4.Imprecision (number of events and confidence intervals) 5.Publication bias 3 factors can increase quality 1.large magnitude of effect 2.all plausible residual confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed 3.dose-response gradient

1. Design and Execution/Risk of Bias Limitation in observational studiesExplanations Failure to develop and apply appropriate eligibility criteria (inclusion of control population) under- or over-matching in case-control studies selection of exposed and unexposed in cohort studies from different populations Flawed measurement of both exposure and outcome differences in measurement of exposure (e.g. recall bias in case- control studies) differential surveillance for outcome in exposed and unexposed in cohort studies Failure to adequately control confounding failure of accurate measurement of all known prognostic factors failure to match for prognostic factors and/or adjustment in statistical analysis Incomplete or inadequately short follow-up

1. Design and Execution/Risk of Bias Limitations in RCTs lack of concealment intention to treat principle violated inadequate blinding loss to follow-up early stopping for benefit selective outcome reporting

Design and Execution/RoB From Cates, CDSR 2008

Design and Execution/RoB Overall judgment required

Yes No Don’t know or undecided Who believes the risk of bias is of concern?

Detailed study design and execution Mortality, cancer and anticoagulation Akl E, Barba M, Rohilla S, Terrenato I, Sperati F, Schünemann HJ. “Anticoagulation for the long term treatment of venous thromboembolism in patients with cancer”. Cochrane Database Syst Rev. 2008 Apr 16;(2):CD006650.

Five trials

Yes No Don’t know or undecided Who believes the risk of bias is of concern?

2. Inconsistency of results (Heterogeneity) if inconsistency, look for explanation – patients, intervention, comparator, outcome if unexplained inconsistency lower quality

Reminders for immunization uptake

Non-steroidal drug use and risk of pancreatic cancer Capurso G, Schünemann HJ, Terrenato I, Moretti A, Koch M, Muti P, Capurso L, Delle Fave G. Meta-analysis: the use of non-steroidal anti-inflammatory drugs and pancreatic cancer risk for different exposure categories. Aliment Pharmacol Ther. 2007 Oct 15;26(8):1089-99.

Inconsistency I 2 P-value Overlap in CI Difference in point estimates

3. Directness of Evidence generalizability, transferability, applicability differences in – populations/patients (HIC – L/MIC, patients with HIV – all patients) – interventions (new fluroquinolones - old) – comparator appropriate (newer antibx – old) – outcomes (important – surrogate; signs and symptoms – mortality) indirect comparisons – interested in A versus B – have A versus C and B versus C – Cryo + antibiotics versus no intervention versus Cryo versus no intervention

4. Publication Bias Should always be suspected – Only small “positive” studies – For profit interest – Various methods to evaluate – none perfect, but clearly a problem

Egger M, Smith DS. BMJ 1995;310:752-54 83 I.V. Mg in acute myocardial infarction Publication bias Meta-analysis Yusuf S.Circulation 1993 ISIS-4 Lancet 1995

Egger M, Cochrane Colloquium Lyon 2001 84 Funnel plot Standard Error Odds ratio 0.10.313 3 2 1 0 100.6 Symmetrical: No publication bias

Egger M, Cochrane Colloquium Lyon 2001 85 Funnel plot Standard Error Odds ratio 0.10.313 3 2 1 0 100.6 Asymmetrical: Publication bias? 0.4

5. Imprecision Small sample size – small number of events Wide confidence intervals – uncertainty about magnitude of effect

Example: Immunization in children

For systematic reviews If the 95% CI excludes a relative risk (RR) of 1.0 and the total number of events or patients exceeds the OIS criterion, precision is adequate. If the 95% CI includes appreciable benefit or harm (we suggest a RR of under 0.75 or over 1.25 as a rough guide) rating down for imprecision may be appropriate even if OIS criteria are met.

Optimal information size We suggest the following: if the total number of patients included in a systematic review is less than the number of patients generated by a conventional sample size calculation for a single adequately powered trial, consider rating down for imprecision. Authors have referred to this threshold as the “optimal information size” (OIS)

025.0%

2.0 0.50 Ischemic stroke point estimate and confidence interval Figure 1, Rating down for imprecision in guidelines: Thresholds are key Favors Intervention Favors Control Risk difference (%) Threshold if side effects and toxicity appreciable, NNT = 100. Confidence interval crosses threshold, rate down for imprecision Threshold if side effects, toxicity and cost minimal, NNT = 200. Entire confidence interval to left of threshold, do not rate down for imprecision

Figure 2: Corticosteroids to reduce hospital mortality in septic shock

Figure 4: Optimal information size given alpha of 0.05 and beta of 0.2 for varying control event rates and relative risks For any chosen line, evidence meets optimal information size criterion if sample size above the line

What can raise quality? 1. large magnitude can upgrade (RRR 50%/RR 2) – very large two levels (RRR 80%/RR 5) – criteria everyone used to do badly almost everyone does well – parachutes to prevent death when jumping from airplanes

BMJ 2003 BMJ, 2003

BMJ 2003 Relative risk reduction: ….> 99.9 % (1/100,000) U.S. Parachute Association reported 821 injuries and 18 deaths out of 2.2 million jumps in 2007

Reminders for immunization uptake

What can raise quality? 2. dose response relation – (higher INR – increased bleeding) – childhood lymphoblastic leukemia risk for CNS malignancies 15 years after cranial irradiation no radiation: 1% (95% CI 0% to 2.1%) 12 Gy: 1.6% (95% CI 0% to 3.4%) 18 Gy: 3.3% (95% CI 0.9% to 5.6%) 3. all plausible residual confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed

All plausible residual confounding would result in an overestimate of effect  Hypoglycaemic drug phenformin causes lactic acidosis  The related agent metformin is under suspicion for the same toxicity.  Large observational studies have failed to demonstrate an association – Clinicians would be more alert to lactic acidosis in the presence of the agent Vaccine – adverse effects

Schünemann et al. JECH 2010

Quality assessment criteria

Overall quality of a body of evidence The quality of evidence reflects the extent of our confidence that the estimates of an effect are adequate to support a particular decision or recommendation. Guideline developers must specify and determine importance of all relevant outcomes Overall quality of evidence is based on the lowest quality of all critical outcomes

Meta-analyses of several critical and important outcomes (one PICO) Better Relative Risk Worse 0.50.7511.251.5 High  Low   Due to imprecision and risk of bias Moderate   Due to imprecision High  Moderate   based on critical outcomes Low  

Meta-analyses of several critical outcomes (one PICO) Better Relative Risk Worse 0.50.7511.251.5 Threshold of acceptable harm for strong recommendation based on sure benefit in mortality and stroke High  Moderate   Due to imprecision High 

Meta-analyses of several critical outcomes (one PICO) Better Relative Risk Worse 0.50.7511.251.5 High  Moderate   due to risk of bias High  Moderate   Threshold of acceptable harm for strong recommendation based on sure benefit in mortality and stroke

Interpretation of grades of evidence  /A/High: Further research is very unlikely to change confidence in the estimate of effect.   /B/Moderate: Further research is likely to have an important impact on confidence in the estimate of effect and may change the estimate.   /C/Low: Further research is very likely to have an important impact on confidence in the estimate of effect and is likely to change the estimate.   /D/Very low: We have very little confidence in the effect estimate: Any estimate of effect is very uncertain.

Evidence Profiles/Summaries

Creating a new GRADEpro file

Profile groups Profiles

Managing outcomes

Importing a RevMan 5 file of a systematic review Imported data from RevMan 5 file: outcomes meta-analyses results bibliographic information

Questions

Content Quality of evidence Going from evidence to recommendations Nicola

Healthcare problem recommendation

Strength of recommendation “The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.” Strong or conditional

Determinants of the strength of recommendation

Trends in guideline production (AHA guidelines, Tricoci JAMA 2009) Recommendations are increasing in size with every update (+48% form first version) Levels (quality of evidence: only a minority of recommendations are based in good evidence (11%) and half (48%) on low quality Recommendations with level of evidence A are mostly concentrated in class I (strong recommendation or useful and effective), but only 245 of 1305 class I recommendations have level of evidence A (median, 19%)

Una rivalutazione delle linee-guida (1/3) … in ACC/AHA guidelines with at least 1 revision, the number of recommendations increased 48% from the first guideline to the most recent version. If there is a main message in such guidelines, it is likely to be lost in the minutiae. Within a guideline document, individual recommendations also need to be prioritized. Finally, guidelines need flexibility. Recommendations should vary based on patient comorbidities, the health care setting, and patient values and preferences. Physicians would be better off making clinical decisions based on valid primary data. Shaneyfelt TM, Centor RM. Reassessment of clinical practice guidelines JAMA 2009

How to improve transparency in going from evidence to recomendations

Evidence Graded Recommend ation (IA, IB, etc…. Strong or weak recommendation Quality assessment criteria Ratings of outcomes Quality assessment criteria Quality of evidence: estimates of benefits & harms and risk of bias, directness, … Risk-benefit ratio, considering cost, access and feasibility Evidence Quality assessment criteria GRADE’s framework/expliciteness

Categories of recommendations Strong recommendation : the panel is confident that the desirable effects of adherence to a recommendation outweigh the undesirable effects. Weak recommendation : the panel concludes that the desirable effects of adherence to a recommendation probably outweigh the undesirable effects, but is not confident. Recommend   Suggest   Although the degree of confidence is a continuum, we suggest using two categories: strong and weak. GRADE Working Group

Implications / translations of strong and weak Strong recommendation Just do it Virtually all well informed individuals would want the intervention and only a small proportion would not Most individuals should receive the intervention Use of the intervention according to the guideline could be used as a quality criterion or performance indicator Weak recommendation Examine the evidence yourself The majority of well informed individuals would want the intervention, but a substantial proportion would not Many but not all individuals should receive the intervention The intervention is not a candidate for a quality criterion or performance indicator. GRADE Working Group

Examples of recommendations using GRADE Examples of flexibility and transparency

Methods – WHO Rapid Advice Guidelines for Avian Flu  Applied findings of a recent systematic evaluation of guideline development for WHO/ACHR  Group composition (including panel of 13 voting members):  clinicians who treated influenza A(H5N1) patients  infectious disease experts  basic scientists  public health officers  methodologists  Independent scientific reviewers:  Identified systematic reviews, recent RCTs, case series, animal studies related to H5N1 infection

Oseltamivir for Avian Flu Summary of findings: No clinical trial of oseltamivir for treatment of H5N1 patients. 4 systematic reviews and health technology assessments (HTA) reporting on 5 studies of oseltamivir in seasonal influenza. – Hospitalization: OR 0.22 (0.02 – 2.16) – Pneumonia: OR 0.15 (0.03 - 0.69) 3 published case series. Many in vitro and animal studies. No alternative that was more promising at present. Cost: 40$ per treatment course

From evidence to recommendation

Example: Oseltamivir for Avian Flu Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (strong recommendation, very low quality evidence). Remarks: This recommendation places a high value on the prevention of death in an illness with a high case fatality. It places relatively low values on adverse reactions, the development of resistance and costs of treatment. Schunemann et al. The Lancet ID, 2007

For opioid agonist maintenance treatment, most patients should be advised to use methadone in adequate doses in preference to buprenorphine. – Strength of recommendation – Strong – Quality of evidence – High WHO Guidelines for the Psychosocially Assisted Pharmacological Treatment of Opioid Dependence (2009) On average, methadone maintenance doses should be in the range of 60–120 mg per day. –Strength of recommendation – Strong –Quality of evidence – Low

Values and preferences (from Antithrombotic therapy) Stroke guideline: patients with TIA clopidogrel over aspirin (Grade 2B). Underlying values and preferences: This recommendation to use clopidogrel over aspirin places a relatively high value on a small absolute risk reduction in stroke rates, and a relatively low value on minimizing drug expenditures.

Values and preferences (from Antithrombotic therapy) peripheral vascular disease: aspirin be used instead of clopidogrel (Grade 2A). Underlying values and preferences: This recommendation places a relatively high value on avoiding large expenditures to achieve small reductions in vascular events.

Criteria for hetherogeneity?

Criteria for imprecision?

implications of strong and weak recommendations … (strong ones have much clearer implications so we tried to show and conceptualise the continuum of strength)

Like P value, also strength is a continuum

Strength of recomendations and expected use Strenght Definition and implications Expected use Strong positive The drugs/interventions should offered to the vast majority of patients and could be used as an indicator of good quality of care. It doesn’t mean however all patients should receive them. Always Strong negative It should not be used neither routinely nor for a subgroup. Only in few very selected cases it should be documented its use since the benefit/risk balance is negative and potential alternative are preferable. Never

Strength of recomendations and indicators Strenght Definition and implications Expected use Strong positive The drugs/interventions should offered to the vast majority of patients and could be used as an indicator of good quality of care. It doesn’t mean however all patients should receive … Almost always Strong negative It should not be used neither routinely nor for a subgroup. Only in few very selected cases it should be documented its use since the benefit/risk balance is negative and potential alternative are preferable. Exceptional cases

Strength of recomendations and expected use Strenght Definition and implications Expected use Strong positive The drugs/interventions should offered to the vast majority of patients and could be used as an indicator of good quality of care. It doesn’t mean however all patients should receive … almost always Strong negative It should not be used neither routinely nor for a subgroup. Only in few very selected cases it should be documented its use since the benefit/risk balance is negative and potential alternative are preferable. exceptional cases When a recomendation can’t be strong, can we formulate a weak recomendation in a way that will be clearly applicable and monitored?

Strength of recomendations and expected use Strenght Definition and implications Expected use Strong positive The drugs/interventions should offered to the vast majority of patients and could be used as an indicator of good quality of care. It doesn’t mean however all patients should receive … almost always Weak positive It has the wider range of uncertainty since it could mean only for a minority of patients (30%) or for a good proportion of them (50- 60%). It is necessary to inform patients of the expected benefits and risks (and their magnitude), explore patients values and discuss potential alternative treatments. 30-60% Weak negative In selected cases or a defined minority. The decision should go along with a detailed information to patient of the benefit risk (magnitude), patients valueds and expectations and disucss potential alternative treatments. 5-30% Strong negative It should not be used neither routinely nor for a subgroup. Only in few very selected cases it should be documented its use since the benefit/risk balance is negative and potential alternative are preferable. exceptional cases

Determinants of strength … in a flexible system (GRADE) BMJ, 2008

GRADE Step 2: risk benefit profile, values and preferences, costs (1/3)

Balancing benefits and downsides ↑ Allergic reactions ↑ Local skin reactions ↑ Nausea ↑ Resources ↑ QoL ↓ Death ↓ Morbidity ↑ herd immunity Conditional Strong ForAgainst

Implications of a strong recommendation Policy makers: The recommendation can be adapted as a policy in most situations Patients: Most people in this situation would want the recommended course of action and only a small proportion would not Clinicians: Most patients should receive the recommended course of action

Implications of a conditional recommendation Policy makers: There is a need for substantial debate and involvement of stakeholders Patients: The majority of people in this situation would want the recommended course of action, but many would not Clinicians: Be more prepared to help patients to make a decision that is consistent with their own values/decision aids and shared decision making

Case scenario A 13 year old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chicken stock shares this room too and several chicken had died unexpectedly a few days before the girl fell sick.

Complex data & decisions: yes/no?

Recommendation -The Guidelines Group recommends that fluoroquinolones are / not used in the treatment of all patients with MDR (Strong(conditional) recommendation/ low(moderate, high) grade of evidence)

Issues in guideline development for immunization Causation versus effects of intervention – Causation not equivalent to efficacy of interventions – Bradford Hill Nearly half a century old – tablet from the mountain? Harms caused by interventions – Assumption is that removal of vaccine (or no exposure) leads to NO adverse effects How confident can one be that removal of the exposure is effective in preventing disease? – Whether immunization or environmental factors: will depend on the intervention to remove exposure

Current state of recommendations 177

Reviewed 7527 recommendations – 1275 randomly selected Inconsistency across/within 31.6% did not recommendations clearly – Most of them not written as executable actions 52.7% did not indicated strength 178

Recommendation The Guideline Group recommends rapid DST testing for resistance to INH and RIF or RIF alone over conventional testing or no testing at the time of diagnosis of TB (conditional,   /low quality evidence). Values and preferences: A high value was placed on outcomes such as preventing death and transmission of MDR as a result of delayed diagnosis as well as avoiding spending resources.

Group composition Group composition might affect recommendation Common principle: include all affected by the recommendations (  multi-disciplinary groups incl. patients/carers) – Industry? Keep a manageable size

The Process: How to make it constructive? Group members are heterogeneous and might have different objectives Chair facilitates rather than leads the group Common understanding of goal, tasks and ground rules Similar level of required knowhow and skills Sufficient technical support

Balanced participation and formal agreement Key task of chair Formal consensus processes Delphi Method Nominal group process Voting

Group processes

How to present controversies Lay out the controversies Describe the evidence Ask members to focus on the agreed upon evidence and the factors leading to a decision Ask whether there still is disagreement Vote – Make voting explicit and transparent (ways of doing this to come tomorrow)

Conclusions - Process Success depends on strong chair(s), training of group, good facilitation and technical support – Clinical and methods co-chairs Formal consensus developing methods might support agreement on recommendations – Voting represents forced consensus Guideline development will require sufficient resources.

GRADE Grid

Systematic review Guideline development PICOPICO Outcome Formulate question Rate importance Critical Important Critical Not important Create evidence profile with GRADEpro Summary of findings & estimate of effect for each outcome Grade overall quality of evidence across outcomes based on lowest quality of critical outcomes Randomization increases initial quality 1.Risk of bias 2.Inconsistency 3.Indirectness 4.Imprecision 5.Publication bias Grade down Grade up 1.Large effect 2.Dose response 3.Confounders Rate quality of evidence for each outcome Select outcomes Very low Low Moderate High Formulate recommendations: For or against (direction) Strong or conditional (strength) By considering:  Quality of evidence  Balance benefits/harms  Values and preferences (Revise by considering:)  Resource use (cost) “We recommend using…/should” “We suggest using…/might” “We recommend against using…/might not” “We suggest against using…/should not” Outcomes across studies

Conflict of Interest From Boyd 2007

The Issues Clinical practice guidelines ideally provide clinicians with impartial, evidence-based recommendations Clinical practice guidelines should be free from outside biases and competing interests Two possible sources of bias: – Commercial sponsorship – Conflicts of interest Boyd and Bero, 2006

Case scenario Would you be concerned about impartialness related to development of recommendations? 1) A statistician who consulted for a lung function device company that has no pharmacological sector is involved in conducting a meta-analysis informing treatment for pneumonia guidelines 2) A researcher is the owner or major shareholder of a company that produces a device about which a recommendation will be formulated by a panel

Case scenario Would you be concerned about impartialness related to development of recommendations? 1)A clinician-researcher at the Associate Professor level has published in an area about which recommendations will be made? She has not received any payment from for-profit organizations. What if her work consistently indicates an effect of the intervention? What if there is limited resources to develop the guidelines and this individual volunteers the time? What role should she have on the panel?

COI A) A divergence between an individual’s private interests and his or her professional obligations such that an independent observer might reasonably question whether the individual’s professional actions or decisions are motivated by personal gain, such as financial, academic advancement, clinical revenue streams or community standing. Schunemann et al., 2008

COI B) A financial or intellectual relationship that may impact an individual’s ability to approach a scientific question with an open mind. Examples: i.)All financial relationships including employment, consultancies, known stock holdings or holdings in a sector fund relevant to the subject matter, honoraria, in kind gifts or benefits, …. ii.)Personal, intellectual or academic relationships that interfere … Schunemann et al., 2008

“A conflict of interest is a set of circumstances that creates a risk that professional judgment or actions regarding a primary interest will be unduly influenced by a secondary interest.” IOM, Lo and Fields editors, National Academic Press, Institute of MEdicine, 2009

Managing conflicts of interest Few explicit descriptions or management criteria Possible range: – Disclosure – Management (reducing equity, eliminating income) – Recusal from voting – Excused from committee Independent groups – Evidence review, methodologist leading panel

Review and Enforcement Few groups have enforcement power beyond dismissal from committee if interests not disclosed Review may be single or multi-step process Appeal process?

What is a significant conflict of interest? Few precise specifications of disqualifying or prohibited relationships 4 models: – No relationships a priori disqualifying (most) – No ties, with exceptions (AAMC: “rebuttable presumption clause”) – Disqualifying relationships: No stock ownership or income >$10,000 or other factors (CARI; AACP) – Alternative model: no financial ties on decision-making panel (NIH Consensus Dev. Program)

Commercial Sponsorship “The perception that a commercial entity, especially pharmaceutical or medical device companies, influenced the conclusions and recommendations of a practice guideline committee could undermine the credibility of both the guidelines and the group that produced it.” Cochrane Collaboration 2004

Risks of commercial sponsorship Abundant evidence regarding commercial sponsorship of research: – Commercially sponsored studies more likely to support the sponsors’ products than non- commercially sponsored studies Across study types and medical specialties

Disclosure of funding sources Disclosure highly variable; often inadequate 50-67% of practice guidelines studies did not report who was involved in the guideline development or the funding sources (Grilli 2000; Taylor 2005)

Disclosures of guideline committee members 3 models of disclosure: – Minimal requests: “Please disclose all related financial interests …” – Structured requests: “Describe consulting, employment, etc.” – Checklists Variation along many dimensions: – When and how often to disclose? – How much detail? – Period covered by disclosure? – Who covered?

Conclusions  WHO guidelines should be based on the best available evidence to be evidence based  GRADE is the approach used by WHO and gaining acceptance internationally  combines what is known in health research methodology and provides a structured approach to improve communication  Does not avoid judgments but provides framework  Criteria for evidence assessment across questions and outcomes  Criteria for moving from evidence to recommendations  Transparent, systematic  four categories of quality of evidence  two grades for strength of recommendations  Transparency in decision making and judgments is key

Feedback and discussion

Guideline development through GRADE August 28, 2011 GIN 2011, Seoul, Korea.

Similar presentations

Presentation on theme: "Guideline development through GRADE August 28, 2011 GIN 2011, Seoul, Korea."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Guideline development through GRADE August 28, 2011 GIN 2011, Seoul, Korea.

Similar presentations

Presentation on theme: "Guideline development through GRADE August 28, 2011 GIN 2011, Seoul, Korea."— Presentation transcript:

Similar presentations

About project

Feedback