
© OCS Consulting The flexible extension to your IT team 1 Embedding equivalence t-test results in Bland Altman Plots visualising rater reliability Jim Groeneveld, OCS Consulting, ‘s Hertogenbosch, Netherlands. PhUSE 2011

Equivalence t-test & Bland Altman
AGENDA / CONTENTS
A. Rater reliability (inter- / intra-)
B. Methods, variable type dependent
C. Equivalence t-test (quantitative)
D. Bland Altman Plots (qualitative)
E. Integration of both: visualising equivalence t-test results in Bland Altman Plots, showing quantitative (in)significant equivalence in the plots
F. Advantages of integration

A. Rater reliability
1. Determine reliability of measuring instrument (device and/or human)
2. Repeated measurements (judgments by raters) on same objects
   a. by same instrument: intra-rater or within-rater reliability (2 or more repetitions)
   b. by similar, but other instrument: inter-rater or between-rater reliability (2 or more)
3. Application (before and after study):
   a. Certification on representative data (before)
   b. QC (on sample) of existing study data (after)

B. Methods, variable type dependent
1. Categorical data (nominal or ordered)
   a. Cohen’s Kappa analysis (>2 cats: Fleiss)
   b. McNemar’s test (>2 cats: McNemar-Bowker)
   Application: non-missing vs missing (binary)
2. Continuous data (interval or ratio)
   a. Mean Absolute Difference (MAD) of pairs
   b. Intraclass Correlation Coefficient (ICC), pairs
   c. Equivalence t-test (quantitative interpretation)
   d. Bland Altman Plots (qualitative interpretation)
   Application: ordered multi-level categorical data

C. Equivalence t-test (range limits)
1. on differences between paired measurements
2. two one-sided non-inferiority t-tests
3. user specification of equivalence range limits ((a)symmetrical)
Result for each combination of pairs of matching, repeated measurements:
1. significant equivalence or not
2. depending on range limits
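The two one-sided tests on paired differences can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SAS implementation behind the slides: the function names `t_cdf` and `paired_tost`, the default level of 0.05 and the sample values are all hypothetical, and the t distribution is integrated numerically so that only the standard library is needed.

```python
import math
from statistics import mean, stdev


def t_cdf(t, df, steps=2000):
    """Student t CDF by Simpson's rule on the density (standard library only)."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def paired_tost(v1, v2, lower, upper, alpha=0.05):
    """Two one-sided t-tests on the paired differences v1 - v2.

    lower and upper are the user-specified equivalence range limits;
    they need not be symmetrical around zero.
    """
    diffs = [a - b for a, b in zip(v1, v2)]
    n = len(diffs)
    centre = mean(diffs)
    se = stdev(diffs) / math.sqrt(n)
    df = n - 1
    p_lower = 1 - t_cdf((centre - lower) / se, df)  # H0: mean difference <= lower
    p_upper = t_cdf((centre - upper) / se, df)      # H0: mean difference >= upper
    return {
        "mean_diff": centre,
        "p_lower": p_lower,
        "p_upper": p_upper,
        "equivalent": max(p_lower, p_upper) < alpha,
    }


# Made-up ratings of six objects by two raters.
v1 = [10.1, 12.0, 9.8, 11.2, 10.5, 9.9]
v2 = [10.0, 12.1, 9.9, 11.0, 10.4, 10.0]
wide = paired_tost(v1, v2, lower=-0.5, upper=0.5)
narrow = paired_tost(v1, v2, lower=-0.05, upper=0.05)
```

With these made-up values the wide limits yield significant equivalence and the narrow limits do not, which illustrates that the outcome depends on the chosen range limits.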

D. Bland Altman Plots
1. Scattergram of pairwise points of:
2. Mean of pairs: $X = (v_1 + v_2)/2$ versus
3. Difference of pairs: $Y = v_1 - v_2$ including
4. Horizontal line of mean difference and
5. Confidence Interval (CI) of points, upper and lower horizontal lines
6. Qualitative interpretation of reliability
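The plot coordinates and horizontal lines listed above can be computed as in the Python sketch below. The function name `bland_altman_points` is hypothetical, and the band of mean difference ± 1.96 standard deviations is an assumption: it is the conventional Bland Altman choice for the interval of the points, and the slides do not state which multiplier is used.

```python
from statistics import mean, stdev


def bland_altman_points(v1, v2, k=1.96):
    """Return the Bland Altman coordinates and reference lines.

    xs: mean of each pair, ys: difference of each pair,
    centre: mean difference, band: (lower, upper) horizontal lines
    at centre -/+ k standard deviations (k = 1.96 is an assumption).
    """
    xs = [(a + b) / 2 for a, b in zip(v1, v2)]
    ys = [a - b for a, b in zip(v1, v2)]
    centre = mean(ys)
    spread = k * stdev(ys)
    return xs, ys, centre, (centre - spread, centre + spread)


# Made-up paired measurements.
xs, ys, centre, band = bland_altman_points([10, 12, 14], [8, 12, 16])
```

The returned lists can be handed to any plotting tool as a scatterplot, with `centre` and the two `band` values drawn as horizontal lines.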

D. Bland Altman Plots (example)

E. Integration of equivalence t-test and Bland Altman Plots
1. Scattergram of pairwise points of:
2. Mean of pairs: $X = (v_1 + v_2)/2$ versus
3. Difference of pairs: $Y = v_1 - v_2$ including
4. Horizontal line of mean difference and
5. Confidence Interval (CI) of the mean, upper and lower horizontal lines
6. T-test range limits, horizontal lines
7. Quantitative interpretation of reliability

E. Integration of equivalence t-test and Bland Altman Plots (example with significant equivalence)

E. Integration of equivalence t-test and Bland Altman Plots
1. visualising equivalence t-test results in Bland Altman Plots
2. showing quantitative significant equivalence in the plots
3. if the Confidence Interval of the mean lies fully within the T-test range limits there is significant equivalence
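The containment rule in point 3 can be checked numerically, as in this Python sketch. The function name `ci_within_limits` and the data are hypothetical. The critical value `t_crit` is supplied by the caller; the value 2.015 used here is the tabulated 95th percentile of Student's t with 5 degrees of freedom, giving a 90% two-sided interval, which is the level that matches two one-sided tests at 0.05. That the slides use this level is an assumption.

```python
import math
from statistics import mean, stdev


def ci_within_limits(diffs, lower, upper, t_crit):
    """Confidence interval of the mean difference and whether it lies
    fully inside the equivalence range limits (lower, upper)."""
    n = len(diffs)
    centre = mean(diffs)
    half_width = t_crit * stdev(diffs) / math.sqrt(n)
    ci = (centre - half_width, centre + half_width)
    return ci, lower < ci[0] and ci[1] < upper


# Made-up paired differences for six objects (df = 5).
diffs = [0.1, -0.1, -0.1, 0.2, 0.1, -0.1]
ci_wide, ok_wide = ci_within_limits(diffs, -0.5, 0.5, t_crit=2.015)
ci_narrow, ok_narrow = ci_within_limits(diffs, -0.05, 0.05, t_crit=2.015)
```

In a plot, the two interval bounds and the two range limits would each be drawn as horizontal lines, so the same comparison can be read off by eye.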

E. Integration of equivalence t-test and Bland Altman Plots (example with non-significant equivalence)

F. Advantages of integration
1. Extension of (value of) Bland Altman Plots with quantitative interpretation on equivalence (in)significance
2. Equivalence (in)significance clearly visualised, depending on range limits
3. Results of two reliability analysis methods in one plot
4. showing a quantitative result and a qualitatively interpretable scatterplot

QUESTIONS & ANSWERS

More than 2 matching measurements
1. Pairwise analysis of repetitions (may yield many pairs if there are more than 3)
2. If more than 3, reduce the number of analyses to “pairs” consisting of:
   a. each individual measurement versus
   b. the mean of all other matching measurements
This reduces the number of “pairs” and analyses and facilitates an overall interpretation of the results.
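The reduction to one "pair" per measurement can be sketched in Python as follows. The function name `one_versus_rest` and the example values are hypothetical; the sketch only shows how each measurement is set against the mean of the remaining ones for a single object.

```python
from statistics import mean


def one_versus_rest(measurements):
    """Pair each repeated measurement of one object with the mean of
    all other matching measurements of that object."""
    pairs = []
    for i, value in enumerate(measurements):
        others = measurements[:i] + measurements[i + 1:]
        pairs.append((value, mean(others)))
    return pairs


# Four made-up repetitions on one object: 4 "pairs" instead of 6 pairwise ones.
pairs = one_versus_rest([10, 12, 14, 16])
```

For k repetitions this gives k "pairs", whereas full pairwise analysis gives k(k-1)/2, so the saving starts once there are more than 3 repetitions.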

A SAS macro (Concord) is currently under development in which these techniques are already supported and applied.
Additional features: relative differences
1. difference between both values: $Y = v_1 - v_2$
2. proportional difference with mean of both: $Y = (v_1 - v_2) / \mathrm{mean}[v_1, v_2] = 2 (v_1 - v_2) / (v_1 + v_2)$
3. (relative) proportion of both values, minus 1: $Y = (v_1 / v_2) - 1 = (v_1 - v_2) / v_2$
4. proportion of one value to the mean of both, minus 1: $Y = (v_1 / \mathrm{mean}[v_1, v_2]) - 1 = (v_1 - v_2) / (v_1 + v_2)$
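The four difference definitions can be compared side by side for a single pair, as in this Python sketch. The function name `difference_variants` and the dictionary keys are hypothetical labels, not names taken from the Concord macro.

```python
def difference_variants(v1, v2):
    """Return the four Y definitions for one pair of measurements.

    Assumes v2 and v1 + v2 are non-zero.
    """
    pair_mean = (v1 + v2) / 2
    return {
        "difference": v1 - v2,
        "difference_over_mean": (v1 - v2) / pair_mean,
        "ratio_minus_one": v1 / v2 - 1,
        "value_over_mean_minus_one": v1 / pair_mean - 1,
    }


# Made-up pair: v1 = 12, v2 = 8, so the pair mean is 10.
y = difference_variants(12, 8)
```

For this pair the four values are 4, 0.4, 0.5 and 0.2 (up to floating-point rounding), so the second definition is exactly twice the fourth, as the closed forms imply.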

SAS Macro TickMark (version 0.0.1)
Neat automatic tick marks for graphs, based on the minimum and maximum of an existing value range (tick marks of 1 to 2 significant digits).
Optional specification: desired minimum and maximum number of tick marks, and minimum percentage of coverage of the existing data range by the generated value range (default values: minimum=7, maximum=12, pct coverage=80).
Return of From, To and By values via macro variables or as a single return value.
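One possible way to derive such From, To and By values is sketched in Python below. This is not the TickMark macro's algorithm, which the slides do not describe; the function name `nice_ticks`, the candidate multipliers 1, 2, 2.5 and 5, and the interpretation of coverage as the data range divided by the generated range are all assumptions. Only the three defaults (7, 12, 80) come from the slide.

```python
import math


def nice_ticks(lo, hi, min_ticks=7, max_ticks=12, min_coverage=80):
    """Return (start, stop, step) for tick marks spanning [lo, hi].

    Steps are 1, 2, 2.5 or 5 times a power of ten (1 to 2 significant
    digits). The first step is accepted that gives a tick count within
    [min_ticks, max_ticks] and a data range covering at least
    min_coverage percent of the generated range.
    """
    span = hi - lo
    if span <= 0:
        raise ValueError("hi must exceed lo")
    base = math.floor(math.log10(span / max_ticks))
    for exponent in (base, base + 1, base + 2):
        for multiplier in (1, 2, 2.5, 5):
            step = multiplier * 10 ** exponent
            start = math.floor(lo / step) * step
            stop = math.ceil(hi / step) * step
            count = round((stop - start) / step) + 1
            coverage = 100 * span / (stop - start)
            if min_ticks <= count <= max_ticks and coverage >= min_coverage:
                return start, stop, step
    raise ValueError("no tick spacing satisfies the constraints")


# Made-up data range from 3 to 97.
start, stop, step = nice_ticks(3, 97)
```

For the range 3 to 97 this sketch returns From=0, To=100, By=10, that is 11 tick marks with the data covering 94% of the generated range.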