Summary of Findings & Assessment of Quality of Evidence: GRADE Workshop. Sunday, October 17, 2010, 0900 to 1700. Introduction.


Introduction to facilitators: Michelle Kho, Jan Brozek, Nancy Santesso, Holger Schunemann, Ingvil von Mehren Sæterdal

Agenda: a mix of presentations, interactive sessions, hands-on work and small group discussions

Systematic review process

Risk of Bias

Meta-analysis

Sensitivity analyses: high versus lower protein diets (studies with <20% losses to follow-up). Outcome: change in systolic blood pressure (mmHg).

Heterogeneity

Subgroup analysis

Funnel Plot
Medline Search Strategy for RCTs and Reviews
1 diet, protein-restricted/
2 diet, carbohydrate-restricted/
3 1 or 2
4 diet fads/
5 (carbohydrate* or protein*).ti,ab.
6 4 and 5
7 exp dietary proteins/
8 dietary carbohydrates/
9 (diet* or intake*).ti,ab.
10 (high* or increas* or rich or low* or restrict* or decreas* or reduc*).ti,ab.
11 (7 or 8) and 9 and 10
12 ((carbohydrate* or protein*) adj3 (high* or increas* or rich or low* or restrict* or decreas* or reduc*)).ti,ab
and or 6 or 11 or
15 randomized controlled trial.pt.
16 controlled clinical trial.pt.
17 randomized.ab.
18 placebo.ab.
19 clinical trials as topic.sh.
20 randomly.ab.
21 trial.ti.
22 or/ humans.sh and and 24

Systematic review process

Cochrane Handbook
Chapter 11: Presenting results and Summary of Findings Tables
Chapter 12: Interpreting results and drawing conclusions

Overview: Interpreting results of a review and GRADE
How does GRADE fit into the process of moving from results to conclusions in systematic reviews?
What are the basic principles behind GRADE?

Consider the following examples of moving from results to conclusions.
How would you interpret the results of the meta-analyses and the conclusions made by the authors?

Authors' conclusions: Short-term beneficial effects were found for fasting for 7 to 10 days followed by a vegetarian diet when compared to ordinary diet.

The pooled SMD for pain reduction comparing glucosamine to placebo was 0.61, which represents a moderate, clinically significant treatment benefit in favour of glucosamine.

What information do you think would increase or decrease your confidence in these results?
What information do you think would indicate that more research is or is not necessary?
Work with your neighbor and discuss for 5 minutes.

To make conclusions, consider the likelihood of effect and the confidence in that effect

To make conclusions, consider:
Likelihood/magnitude of effect
  – Mood will drop by 5 points on a scale of 0 to 10 when on a high protein diet.
Quality of the evidence
  – Low: This estimate will likely change with more research.

Summary of Findings Table
a summary of the key findings from the systematic review for users
a transparent aid and record of the authors' interpretation of the results to make conclusions

Weighing the criteria for overall quality of evidence
In fact, in this example:
  – Allocation concealment is unclear in one of the studies
  – Only three of five studies measured major bleeding, a primary outcome in anticoagulation studies, suggesting selective outcome reporting
  – The confidence intervals include potential for harm or no harm
I might say that my confidence in the results is "low" and that more research is likely to change the results.

The pooled SMD for pain reduction comparing glucosamine to placebo was 0.61, which represents a moderate, clinically significant treatment benefit in favour of glucosamine.

Likelihood of and confidence in an outcome

Quality of evidence across studies for an outcome
⊕⊕⊕⊕ High: Further research is very unlikely to change our confidence in the estimate of effect or accuracy.
⊕⊕⊕○ Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect or accuracy and may change the estimate.
⊕⊕○○ Low: Further research is very likely to have an important impact on our confidence in the estimate of effect or accuracy and is likely to change the estimate.
⊕○○○ Very low: Any estimate of effect or accuracy is very uncertain.

GRADE: recommendation – quality of evidence
Clear separation:
1) 4 categories of quality of evidence: ⊕⊕⊕⊕ (High), ⊕⊕⊕○ (Moderate), ⊕⊕○○ (Low), ⊕○○○ (Very low)
  – methodological quality of evidence
  – likelihood of bias
  – by outcome and across outcomes
2) Recommendation: 2 grades, conditional (aka weak) or strong (for or against an intervention)
  – balance of benefits and downsides, values and preferences, resource use and quality of evidence

GRADE Quality of Evidence
In the context of a systematic review: the quality of evidence reflects the extent to which we are confident that an estimate of effect is correct.
In the context of making recommendations: the quality of evidence reflects the extent to which our confidence in an estimate of the effect is adequate to support a particular recommendation.

Determinants of quality
RCTs start high (⊕⊕⊕⊕); observational studies start low (⊕⊕○○)
5 factors that can lower quality
1. Limitations in detailed design and execution (risk of bias criteria)
2. Inconsistency (or heterogeneity)
3. Indirectness (PICO and applicability)
4. Imprecision (number of events and confidence intervals)
5. Publication bias
3 factors that can increase quality
1. Large magnitude of effect
2. All plausible residual confounding may be working to reduce the demonstrated effect, or to increase the effect if no effect was observed
3. Dose-response gradient
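
The bookkeeping behind these determinants can be sketched in a few lines of Python. This is only an illustration: the function name, the factor labels and the idea of summing levels mechanically are assumptions made for the example, and GRADE itself asks for an overall judgment rather than arithmetic.

```python
# Illustrative sketch only: GRADE requires judgment, not mechanical scoring.
# The names LEVELS and grade_quality are hypothetical.
LEVELS = ["very low", "low", "moderate", "high"]


def grade_quality(design, downgrades=None, upgrades=None):
    """Return a quality level for one outcome.

    design:     "rct" (starts high) or "observational" (starts low).
    downgrades: dict of factor name -> levels to rate down (0, 1 or 2).
    upgrades:   dict of factor name -> levels to rate up (0, 1 or 2).
    """
    start = {"rct": 3, "observational": 1}[design]
    down = sum((downgrades or {}).values())
    up = sum((upgrades or {}).values())
    # Clamp so the result stays between "very low" and "high".
    score = max(0, min(3, start - down + up))
    return LEVELS[score]


# RCT evidence rated down once for risk of bias and once for imprecision.
print(grade_quality("rct", {"risk_of_bias": 1, "imprecision": 1}))
# Observational evidence rated up once for a large magnitude of effect.
print(grade_quality("observational", upgrades={"large_effect": 1}))
```

Keeping each factor as a named entry mirrors the advice to record in footnotes why a factor was or was not rated down.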

1. Design and Execution/Risk of Bias
Examples:
  – Inappropriate selection of exposed and unexposed groups
  – Failure to adequately measure/control for confounding
  – Selective outcome reporting
  – Failure to blind (e.g. outcome assessors)
  – High loss to follow-up
  – Lack of concealment in RCTs
  – Intention to treat principle violated

Design and Execution/RoB (from Cates, CDSR 2008)

Design and Execution/RoB: overall judgment required

2. Inconsistency of results (Heterogeneity)
If there is inconsistency, look for an explanation
  – patients, intervention, comparator, outcome
If the inconsistency is unexplained, lower quality

Reminders for immunization uptake

Judgment
  – variation in size of effect
  – overlap in confidence intervals
  – statistical significance of heterogeneity
  – I²
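
As a rough illustration of the last two items, the sketch below computes Cochran's Q and I² from study effect estimates and their standard errors, using fixed-effect inverse-variance weights. The function name and the input values are made up for the example; effects are assumed to be on an additive scale such as log relative risk.

```python
def heterogeneity(effects, std_errors):
    """Return (pooled effect, Cochran's Q, I-squared in percent).

    effects:    per-study estimates on an additive scale (e.g. log RR).
    std_errors: matching standard errors.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation beyond what chance would explain.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical inputs: two studies with equal precision but different effects.
pooled, q, i2 = heterogeneity([0.0, 1.0], [0.5, 0.5])
print(pooled, q, i2)
```

I² is only one input to the judgment; variation in effect size and overlap of confidence intervals still need to be looked at directly.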

Inconsistency when 1 study? Do not downgrade

3. Directness of Evidence
generalizability, transferability, applicability
differences in
  – populations/patients (HIC – L/MIC, women in general – pregnant women)
  – interventions (all techniques, new – old)
  – comparator appropriate (newer technique – old or no technique)
  – outcomes (important – surrogate: CIN I – cancer)
indirect comparisons
  – interested in A versus B
  – have A versus C and B versus C
  – Cryo + antibiotics versus no intervention versus Cryo - antibiotics

EVIDENCE PROFILE
Question: Cryotherapy with antibiotics vs no antibiotics for histologically confirmed CIN
Outcome: Major infection (follow-up 12 months [1]; requiring hospitalisation or blood transfusion)
  No of studies: 16
  Design: observational studies
  Limitations: no serious limitations
  Inconsistency: no serious inconsistency
  Indirectness: serious [2]
  Imprecision: no serious imprecision
  Other: none
  Cryotherapy with antibiotics: 0/1600 (0%)
  No antibiotics: 10/4573 (0.22%)
  Relative effect (95% CI): RD 0 (0 to 0)
  Absolute effect: 0 fewer per 1000
  Importance: IMPORTANT
Outcome: Resource use
  not measured
Outcome: All severe adverse events (follow-up 12 months; major infections and bleeding, pelvic inflammatory disease, stenosis, etc.)
  No of studies: 17
  Design: observational studies
  Limitations: no serious limitations
  Inconsistency: no serious inconsistency
  Indirectness: serious [2]
  Imprecision: no serious imprecision
  Other: none
  Cryotherapy with antibiotics: 0/1705 (0%)
  No antibiotics: 22/5142 (0.43%)
  Relative effect (95% CI): RD 0 (0 to 0)
  Absolute effect: 0 fewer per 1000
  Importance: IMPORTANT
[1] All rates presented at 12 months with assumption that events would occur within this time frame.
[2] Indirect analysis between single arm observational studies

4. Publication Bias
Should always be suspected
  – Only small "positive" studies
  – For profit interest
  – Various methods to evaluate: none perfect, but clearly a problem

Publication bias: I.V. Mg in acute myocardial infarction
Meta-analysis (Yusuf S. Circulation 1993) versus ISIS-4 (Lancet 1995)
Source: Egger M, Smith DS. BMJ 1995;310:

Funnel plot (axes: standard error, odds ratio). Symmetrical: no publication bias. Source: Egger M, Cochrane Colloquium Lyon

Funnel plot (axes: standard error, odds ratio). Asymmetrical: publication bias? Source: Egger M, Cochrane Colloquium Lyon
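
Funnel plot asymmetry is usually judged by eye, but it can also be summarised numerically. The sketch below uses Egger's regression, a technique the slides do not describe and which is brought in here only as one of the "various methods": each study's standardized effect (effect divided by its standard error) is regressed on its precision (1 / standard error), and an intercept far from zero hints at small-study effects. The function name and the data are hypothetical, and no significance test is attempted.

```python
def egger_intercept(effects, std_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns (intercept, slope). The slope estimates the underlying
    effect; an intercept far from zero suggests funnel asymmetry.
    """
    x = [1.0 / se for se in std_errors]
    y = [e / se for e, se in zip(effects, std_errors)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical log odds ratios: the same effect in every study (symmetrical).
ses = [0.1, 0.2, 0.3, 0.4]
print(egger_intercept([-0.5] * 4, ses))
# Hypothetical small-study effect: less precise studies show larger benefit.
print(egger_intercept([-0.5 - 1.0 * se for se in ses], ses))
```

Like the funnel plot itself, this is none of a perfect method: with few studies the intercept is unstable, so it supports rather than replaces judgment.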

5. Imprecision
Small sample size
  – small number of events
Wide confidence intervals
  – uncertainty about magnitude of effect
Extent to which confidence in estimate of effect is adequate to support decision

Example: Immunization in children

For systematic reviews
If the 95% CI excludes a relative risk (RR) of 1.0 and the total number of events or patients exceeds the OIS criterion, precision is adequate.
If the 95% CI includes appreciable benefit or harm (we suggest a RR of under 0.75 or over 1.25 as a rough guide), rating down for imprecision may be appropriate even if OIS criteria are met.
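
The two rules can be written down as a small decision helper. This is a sketch under stated assumptions: the function name and return strings are invented, the second rule is read as applying when the interval also includes no effect, and the OIS value passed in the usage lines is hypothetical.

```python
def imprecision_check(ci_low, ci_high, total_events, ois,
                      benefit_rr=0.75, harm_rr=1.25):
    """Apply the two rough rules for relative risks.

    ci_low, ci_high: 95% CI of the relative risk.
    total_events:    events (or patients) in the review.
    ois:             optimal information size in the same unit.
    """
    excludes_no_effect = ci_low > 1.0 or ci_high < 1.0
    exceeds_ois = total_events > ois
    if excludes_no_effect and exceeds_ois:
        return "adequate"
    # Assumed reading: the 0.75 / 1.25 guide matters when RR = 1 is inside the CI.
    includes_appreciable = ci_low < benefit_rr or ci_high > harm_rr
    if not excludes_no_effect and includes_appreciable:
        return "consider rating down: CI includes appreciable benefit or harm"
    if not exceeds_ois:
        return "consider rating down: OIS not met"
    return "judgment needed"


# Major bleeding example from this workshop, RR 1.50 (0.26 to 8.80);
# the event count and OIS here are hypothetical.
print(imprecision_check(0.26, 8.80, total_events=10, ois=300))
# Flavonoid example, RR 0.4 (0.29 to 0.57), 375 events; OIS is hypothetical.
print(imprecision_check(0.29, 0.57, total_events=375, ois=300))
```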

Optimal information size
We suggest the following: if the total number of patients included in a systematic review is less than the number of patients generated by a conventional sample size calculation for a single adequately powered trial, consider rating down for imprecision. Authors have referred to this threshold as the "optimal information size" (OIS).
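
One conventional sample size calculation for comparing two proportions looks like the sketch below. It assumes a two-sided alpha, the normal-approximation formula with unpooled variances, and equal group sizes; the function name and the control event rate and relative risk reduction in the usage line are illustrative choices, not values from the workshop.

```python
import math
from statistics import NormalDist


def optimal_information_size(control_rate, rrr, alpha=0.05, beta=0.20):
    """Total patients (both arms) for a single adequately powered trial.

    control_rate: event rate in the control group (0 to 1).
    rrr:          relative risk reduction to detect (0 to 1).
    """
    p1 = control_rate
    p2 = control_rate * (1.0 - rrr)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided alpha
    z_beta = NormalDist().inv_cdf(1.0 - beta)          # power = 1 - beta
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    n_per_group = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return 2 * math.ceil(n_per_group)


# Hypothetical scenario: 20% control event rate, 25% relative risk reduction.
print(optimal_information_size(0.20, 0.25))
```

A review whose total sample falls below this number would be a candidate for rating down, subject to the judgment described above.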

Figure 1, Rating down for imprecision in guidelines: thresholds are key
Ischemic stroke point estimate and confidence interval (x-axis: risk difference (%), favors intervention versus favors control)
  – Threshold if side effects and toxicity appreciable, NNT = 100: confidence interval crosses threshold, rate down for imprecision
  – Threshold if side effects, toxicity and cost minimal, NNT = 200: entire confidence interval to left of threshold, do not rate down for imprecision
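
The threshold logic in this figure reduces to converting an NNT into a risk difference and asking whether the confidence interval straddles it. In the sketch below the function name is hypothetical, and the confidence interval used in the usage lines is an illustrative assumption because the figure's numbers did not survive in this transcript.

```python
def crosses_nnt_threshold(ci_low_pct, ci_high_pct, nnt):
    """Return True if the CI for the absolute risk reduction (in percent,
    positive values favoring the intervention) crosses the threshold
    implied by the given NNT, i.e. rating down should be considered."""
    threshold_pct = 100.0 / nnt  # NNT 100 -> 1%, NNT 200 -> 0.5%
    return ci_low_pct < threshold_pct < ci_high_pct


# Hypothetical CI for absolute risk reduction: 0.6% to 2.0%.
print(crosses_nnt_threshold(0.6, 2.0, nnt=100))  # threshold 1%: crosses
print(crosses_nnt_threshold(0.6, 2.0, nnt=200))  # threshold 0.5%: does not cross
```

The same interval can therefore be precise enough for one decision and too imprecise for another, which is why the threshold has to be chosen before judging imprecision.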

Figure 2: Corticosteroids to reduce hospital mortality in septic shock

Figure 4: Optimal information size given alpha of 0.05 and beta of 0.2 for varying control event rates and relative risks. For any chosen line, evidence meets the optimal information size criterion if the sample size is above the line.

Table 1: Optimal information size implications from Figure 5
Total Number of Events | Relative Risk Reduction | Implications for meeting OIS threshold
100 or less | < 30% | Will almost never meet threshold whatever control event rate
200 | 30% | Will meet threshold for control event rates for ~ 25% or greater
200 | 25% | Will meet threshold for control event rates for ~ 50% or greater
200 | 20% | Will meet threshold only for control event rates for ~ 80% or greater
300 | > 30% | Will meet threshold
300 | 25% | Will meet threshold for control event rates ~ 25% or greater
300 | 20% | Will meet threshold for control event rates ~ 60% or greater
400 or more | > 25% | Will meet threshold for any control event rate
400 or more | 20% | Will meet threshold for control event rates of ~ 40% or greater

What can raise quality?
1. large magnitude can upgrade (RRR 50%/RR 2)
  – very large: two levels (RRR 80%/RR 5)
  – criteria: everyone used to do badly, almost everyone does well
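
The two cut-offs on this slide can be expressed as a small helper. The function name is hypothetical, and folding protective effects (RR below 1) onto the same scale by taking the reciprocal is an assumption made so that RRR 50% and RR 2 land on the same threshold, as the slide pairs them.

```python
def upgrade_for_magnitude(rr):
    """Levels to consider rating up for a large effect.

    rr: relative risk in either direction. RR 0.5 (RRR 50%) is treated
    like RR 2, and RR 0.2 (RRR 80%) like RR 5.
    """
    folded = rr if rr >= 1.0 else 1.0 / rr
    if folded >= 5.0:
        return 2  # very large effect
    if folded >= 2.0:
        return 1  # large effect
    return 0


print(upgrade_for_magnitude(0.4))  # RR 0.4 corresponds to a folded RR of 2.5
print(upgrade_for_magnitude(0.2))  # RRR 80%
```

Whether to apply the upgrade still depends on the rest of the assessment; the helper only reports which threshold an estimate reaches.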

BMJ, 2003

Reminders for immunization uptake

What can raise quality?
2. dose response relation
  – (higher INR – increased bleeding)
  – childhood lymphoblastic leukemia: risk for CNS malignancies 15 years after cranial irradiation
      no radiation: 1% (95% CI 0% to 2.1%)
      12 Gy: 1.6% (95% CI 0% to 3.4%)
      18 Gy: 3.3% (95% CI 0.9% to 5.6%)

In terms of high altitude sickness, symptoms generally do not manifest below 1500 m. From about 1500 to 2500 m, symptoms are generally mild, if experienced at all. At 2500 m, symptoms of mild to moderate acute mountain sickness (AMS) become quite common among unacclimatized visitors after rapid ascent. At this altitude high altitude pulmonary edema (HAPE) may also occur, but it is more common above 3000 m. Above 3000 to 4000 m, AMS is common among people who have not properly acclimatized, and the risk of severe consequences, including life-threatening HAPE and cerebral edema, is substantial.

What can raise quality?
3. all plausible residual confounding may be working to reduce the demonstrated effect, or to increase the effect if no effect was observed

All plausible residual confounding would result in an overestimate of effect
Hypoglycaemic drug phenformin causes lactic acidosis
The related agent metformin is under suspicion for the same toxicity.
Large observational studies have failed to demonstrate an association
  – Clinicians would be more alert to lactic acidosis in the presence of the agent
Vaccine – adverse effects

Quality assessment criteria

Pulling it all together and drawing conclusions

Carefully consider and assess all the factors that may influence the quality of evidence.
Bear in mind that down- and upgrading for specific quality factors should be done in the context of all of the factors that influence the quality of evidence.
Downgrading for one quality criterion may influence how the next quality criterion is dealt with.

Within and among
Downgrade or upgrade on a continuum
Downgrade or upgrade
  – WITHIN each category
  – AMONG the categories

Example: Meta-analysis of 5 studies
Uncertainty about three factors: study limitations/RoB, inconsistency, and imprecision
Uncertainty not serious enough to downgrade each factor
Option to pick one or two levels to downgrade
Indicate in footnotes why you did or did not downgrade for those factors (e.g. There was some uncertainty but already downgraded for...)

Survival HR 0.77 (0.65 to 0.91)

How confident are you that these results are true?

Study limitations

Would you downgrade for risk of bias?
No, there are no serious limitations
Yes, there are serious limitations
Yes, there are very serious limitations

From risk of bias to limitations in design

Quality now? High

Inconsistency

Who believes there is important inconsistency (rather than random error)?
No, there is no serious inconsistency
Yes, there is serious inconsistency
Yes, there is very serious inconsistency

Quality now? High

Indirectness: Direct comparison? Population? Intervention? Outcome?

Quality now? High

Publication bias

Quality now? High

Imprecision

Quality now? High
No upgrading

Major bleeding RR 1.50 (0.26 – 8.80)

Study limitations

Would you downgrade for risk of bias?
No, there are no serious limitations – although there ….
Yes, there are serious limitations – most people would agree that selective reporting is….
Yes, there are very serious limitations – there is a risk of bias but only for the one criterion of selective reporting

From risk of bias to limitations in design

Quality now? Moderate

Imprecision

Quality now? Low
Observational studies could have provided higher quality evidence

Flavonoids for Hemorrhoids
venotonic agents
  – mechanism unclear, increase venous return
popularity
  – 90 venotonics commercialized in France
  – none in Sweden and Norway
  – France 70% of world market
possibilities
  – French misguided
  – rest of world missing out

Systematic Review
14 trials, 1432 patients
key outcome
  – risk not improving/persistent symptoms
  – 11 studies, 1002 patients, 375 events
  – RR 0.4, 95% CI 0.29 to 0.57
minimal side effects
is France right?
what is the quality of evidence?

What can lower quality?
Study limitations/risk of bias
  – lack of detail re concealment
  – questionnaires not validated
rate down quality for study limitations/RoB?
indirectness
  – no problem
inconsistency: need to look at the results

Publication bias?
size of studies
  – 40 to 234 patients, most around 100
all industry sponsored

What can lower quality?
risk of bias
  – lack of detail re concealment
  – questionnaires not validated
inconsistency
  – heterogeneity p < 0.001; I %
indirectness
imprecision
  – RR 0.4, 95% CI 0.29 to 0.57
publication bias
  – 40 to 234 patients, most around 100

Conclusions
WHO guidelines should be based on the best available evidence to be evidence based
GRADE is the approach used by WHO and is gaining acceptance internationally
  – combines what is known in health research methodology and provides a structured approach to improve communication
  – does not avoid judgments but provides a framework
Criteria for evidence assessment across questions and outcomes
Criteria for moving from evidence to recommendations
Transparent, systematic
  – four categories of quality of evidence
  – two grades for strength of recommendations
Transparency in decision making and judgments is key