Estimating Reliability RCMAR/EXPORT Methods Seminar Series Drew (Cobb Room 131)/ UCLA (2 nd Floor at Broxton) December 12, 2011.

Slides:



Advertisements
Similar presentations
Números.
Advertisements

Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.
Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.
AGVISE Laboratories %Zone or Grid Samples – Northwood laboratory
Trend for Precision Soil Testing % Zone or Grid Samples Tested compared to Total Samples.
Fill in missing numbers or operations
AP STUDY SESSION 2.
EuroCondens SGB E.
Worksheets.
Sequential Logic Design
1 Copyright © 2013 Elsevier Inc. All rights reserved. Appendix 01.
Addition and Subtraction Equations
Multiplication X 1 1 x 1 = 1 2 x 1 = 2 3 x 1 = 3 4 x 1 = 4 5 x 1 = 5 6 x 1 = 6 7 x 1 = 7 8 x 1 = 8 9 x 1 = 9 10 x 1 = x 1 = x 1 = 12 X 2 1.
Division ÷ 1 1 ÷ 1 = 1 2 ÷ 1 = 2 3 ÷ 1 = 3 4 ÷ 1 = 4 5 ÷ 1 = 5 6 ÷ 1 = 6 7 ÷ 1 = 7 8 ÷ 1 = 8 9 ÷ 1 = 9 10 ÷ 1 = ÷ 1 = ÷ 1 = 12 ÷ 2 2 ÷ 2 =
Disability status in Ethiopia in 1984, 1994 & 2007 population and housing sensus Ehete Bekele Seyoum ESA/STAT/AC.219/25.
Add Governors Discretionary (1G) Grants Chapter 6.
CALENDAR.
2.11.
Identifying money correctly 1
Around the World AdditionSubtraction MultiplicationDivision AdditionSubtraction MultiplicationDivision.
Learning to show the remainder
The 5S numbers game..
A Fractional Order (Proportional and Derivative) Motion Controller Design for A Class of Second-order Systems Center for Self-Organizing Intelligent.
Sampling in Marketing Research
Break Time Remaining 10:00.
This module: Telling the time
The basics for simulations
PP Test Review Sections 6-1 to 6-6
Psychometric Evaluation of Multi-Item Scales Questionnaire Design and Testing Workshop February 6, 2013, 1-2:30pm Wilshire Blvd. Suite 710 Los Angeles,
The Pecan Market How long will prices stay this high?? Brody Blain Vice – President.
TCCI Barometer March “Establishing a reliable tool for monitoring the financial, business and social activity in the Prefecture of Thessaloniki”
1 Prediction of electrical energy by photovoltaic devices in urban situations By. R.C. Ott July 2011.
15. Oktober Oktober Oktober 2012.
Copyright © 2012, Elsevier Inc. All rights Reserved. 1 Chapter 7 Modeling Structure with Blocks.
Progressive Aerobic Cardiovascular Endurance Run
1 Core Segments: Price Value Shoppers : Very much focused on getting the best value for their money, Price Value Shoppers love to shop, and take pride.
Comparison of X-ray diffraction patterns of La 2 CuO 4+   from different crystals at room temperature Pia Jensen.
MaK_Full ahead loaded 1 Alarm Page Directory (F11)
TCCI Barometer September “Establishing a reliable tool for monitoring the financial, business and social activity in the Prefecture of Thessaloniki”
When you see… Find the zeros You think….
Splines IV – B-spline Curves
LN-251 SimINERTIAL Performance
2011 WINNISQUAM COMMUNITY SURVEY YOUTH RISK BEHAVIOR GRADES 9-12 STUDENTS=1021.
Before Between After.
Benjamin Banneker Charter Academy of Technology Making AYP Benjamin Banneker Charter Academy of Technology Making AYP.
2011 FRANKLIN COMMUNITY SURVEY YOUTH RISK BEHAVIOR GRADES 9-12 STUDENTS=332.
2.10% more children born Die 0.2 years sooner Spend 95.53% less money on health care No class divide 60.84% less electricity 84.40% less oil.
Subtraction: Adding UP
: 3 00.
5 minutes.
Numeracy Resources for KS2
2.4 Bases de Dados Estudo de Caso. Caso: Caixa Eletrônico Caixa Eletrônico com acesso à Base de Dados; Cada cliente possui:  Um número de cliente  Uma.
Static Equilibrium; Elasticity and Fracture
Resistência dos Materiais, 5ª ed.
Clock will move after 1 minute
Select a time to count down from the clock above
Distributed Computing 9. Sorting - a lower bound on bit complexity Shmuel Zaks ©
Patient Survey Results 2013 Nicki Mott. Patient Survey 2013 Patient Survey conducted by IPOS Mori by posting questionnaires to random patients in the.
Introduction Embedded Universal Tools and Online Features 2.
Schutzvermerk nach DIN 34 beachten 05/04/15 Seite 1 Training EPAM and CANopen Basic Solution: Password * * Level 1 Level 2 * Level 3 Password2 IP-Adr.
Evaluating Multi-Item Scales Health Services Research Design (HS 225A) November 13, 2012, 2-4pm (Public Health)
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire Design and Testing.
Psychometric Evaluation of Questionnaire Design and Testing Workshop December , 10:00-11:30 am Wilshire Suite 710 DATA.
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire.
Evaluating Multi-item Scales Ron D. Hays, Ph.D. UCLA Division of General Internal Medicine/Health Services Research HS237A 11/10/09, 2-4pm,
Evaluating Multi-Item Scales Health Services Research Design (HS 225B) January 26, 2015, 1:00-3:00pm CHS.
Evaluating Multi-Item Scales
Evaluating Multi-Item Scales
Presentation transcript:

Estimating Reliability RCMAR/EXPORT Methods Seminar Series Drew (Cobb Room 131)/ UCLA (2 nd Floor at Broxton) December 12, 2011

Anonymous Dedication

Reliability Minimum Standards 0.70 or above (for group comparisons) 0.90 or higher (for individual assessment) SEM = SD (1- reliability) 1/2

Two Raters Ratings of GOP Debate Performance on Excellent to Poor Scale Bachman Turner Overdrive (Good, Very Good) Ging Rich (Very Good, Excellent) Rue Paul (Good, Good) Gaylord Perry (Fair, Poor) Romulus Aurelius (Excellent, Very Good) Sanatorium (Fair, Fair)

Cross-Tab of Ratings Rater 1Total PFGVGE P011 F11 G11 VG1012 E Rater 2

Calculating KAPPA P C = (0 x 1) + (2 x 1) + (2 x 1) + (1 x 2) + (1 x 1) =0.19 (6 x 6) P obs. = 2 = Kappa = 0.33– 0.19 =0.17(0.52, 0.77)

Linear and Quadratic Weighted Kappa PFGVGE P1.75 (.937).50 (.750).25 (.437)0 F.75 (.937)1.50 (.750).25 (.437) G.50 (.750).75 (.937)1.50 (.750) VG.25 (.437).50 (.750).75 (.937)1 E0.25 (.437).5 (.750).75 (.937)1 W i = 1 – ( i/ (k – 1) I = number of categories ratings differ by k = n of categories W i = 1 – (i 2 / (k – 1) 2

8 Intraclass Correlation and Reliability ModelIntraclass CorrelationReliability One- way Two- way fixed Two- way random BMS = Between Ratee Mean Square N = n of ratees WMS = Within Mean Square k = n of replicates JMS = Item or Rater Mean Square EMS = Ratee x Item (Rater) Mean Square

Reliability of Performance Ratings Candidates (BMS) Raters (JMS) Cand. x Raters (EMS) Total Source df SSMS 2-way R = 2 ( ) = (3.13) ICC = 0.80

GOP Presidential Candidates Responses to Two Questions about Their Health Bachman Turner Overdrive (Good, Very Good) Ging Rich (Very Good, Excellent) Rue Paul (Good, Good) Gaylord Perry (Fair, Poor) Romulus Aurelius (Excellent, Very Good) Sanatorium (Fair, Fair)

Two-Way Fixed Effects (Cronbachs Alpha) Respondents (BMS) Items (JMS) Resp. x Items (EMS) Total Source df SSMS Alpha = = 2.93 =

Overall Satisfaction of 12 Patients with 6 Doctors (2 patients per doctor) Dr. Overdrive (p1: Good, p2: Very Good) Dr. Rich (p3: Very Good, p4: Excellent) Dr. Paul (p5: Good, p6: Good) Dr. Perry (p7: Fair, p8: Poor) Dr. Aurelius (p9: Excellent, p10: Very Good) Dr. Sanatorium (p11: Fair, p12: Fair)

Reliability of Ratings of Doctor Respondents (BMS) Within (WMS) Total Source df SSMS 1-way = = 2.80 =

Candidates Perceptions of the U.S. Economy in November & December, 2011 Bachman Turner Overdrive (Good, Very Good) Ging Rich (Very Good, Excellent) Rue Paul (Good, Good) Gaylord Perry (Fair, Poor) Romulus Aurelius (Excellent, Very Good) Sanatorium (Fair, Fair) Which model would you use to estimate reliability?

Reliability and SEM For z-scores (mean = 0 and SD = 1): –Reliability = 1 – SE 2 –So reliability = 0.90 when SE = 0.32 For T-scores (mean = 50 and SD = 10): –Reliability = 1 – (SE/10) 2 –So reliability = 0.90 when SE = 3.2

In the past 7 days I was grouchy [1 st question] –Never –Rarely –Sometimes –Often –Always Theta = 56.1 SE = 5.7 (rel. = 0.68)

In the past 7 days … I felt like I was read to explode [2 nd question] –Never –Rarely –Sometimes –Often –Always Theta = 51.9 SE = 4.8 (rel. = 0.77)

In the past 7 days … I felt angry [3 rd question] –Never –Rarely –Sometimes –Often –Always Theta = 50.5 SE = 3.9 (rel. = 0.85)

In the past 7 days … I felt angrier than I thought I should [4 th question] –Never –Rarely –Sometimes –Often –Always Theta = 48.8 SE = 3.6 (rel. = 0.87)

In the past 7 days … I felt annoyed [5 th question] –Never –Rarely –Sometimes –Often –Always Theta = 50.1 SE = 3.2 (rel. = 0.90)

In the past 7 days … I made myself angry about something just by thinking about it. [6 th question] –Never –Rarely –Sometimes –Often –Always Theta = 50.2 SE = 2.8 (rel = 0.92)

Theta and SEM estimates 56 and 6 (reliability =.68) 52 and 5 (reliability =.77) 50 and 4 (reliability =.85) 49 and 4 (reliability =.87) 50 and 3 (reliability =.90) 50 and <3 (reliability =.92)

Thank you. Powerpoint file posted at URL below (freely available for you to use, copy or burn): Contact information: For a good time call or go to:

Appendices ANOVA Computations Candidate/Respondents SS ( )/2 – 38 2 /12 = Rater/Item SS ( )/6 – 38 2 /12 = 0.00 Total SS ( ) – 38 2 /10 = Res. x Item SS= Tot. SS – (Res. SS+Item SS)

options ls=130 ps=52 nocenter; options nofmterr; data one; input id 1-2 rater 4 rating 5; CARDS; ; run; **************;

proc freq; tables rater rating; run; *******************; proc means; var rater rating; run; *******************************************; proc anova; class id rater; model rating=id rater id*rater; run; *******************************************;

data one; input id 1-2 rater 4 rating 5; CARDS; ; run; ******************************************************************; %GRIP(indata=one,targetv=id,repeatv=rater,dv=rating, type=1,t1=test of GRIP macro,t2=); GRIP macro is available at:

data one; input id 1-2 rater1 4 rater2 5; control=1; CARDS; ; run; **************; DATA DUMMY; INPUT id 1-2 rater1 4 rater2 5; CARDS; RUN;

DATA NEW; SET ONE DUMMY; PROC FREQ; TABLES CONTROL*RATER1*RATER2 /NOCOL NOROW NOPERCENT AGREE; *******************************************; data one; set one; *****************************************; proc means; var rater1 rater2; run; *******************************************; proc corr alpha; var rater1 rater2; run;

Guidelines for Interpreting Kappa ConclusionKappaConclusionKappa Poor <.40 Poor < 0.0 Fair Slight Good Fair Excellent >.74 Moderate Substantial Almost perfect Fleiss (1981) Landis and Koch (1977)