
Reliability: Introduction

Reliability Session
1. Definitions & Basic Concepts of Reliability
2. Theoretical Approaches
3. Empirical Assessments of Reliability
4. Interpreting Coefficients

1. Conceptions of Reliability
“This patient is often late!” *
“My car won’t start!”
[Diagram: spread of repeated measurements, roughly ± S.E.M.]
* Does this make him reliable or unreliable?

Classic view of the components of a measurement:
Measured Value = True Value + Systematic Error (Bias) + Random Error
The usefulness of a score depends on the ratio of its true-value component to any error variance that it contains.
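As an illustration (not from the slides), here is a minimal simulation of this decomposition; the distribution parameters and variable names are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

true = rng.normal(50, 10, n)    # true values across subjects (sd = 10)
bias = 3.0                      # systematic error: shifts every score equally
error = rng.normal(0, 5, n)     # random error, mean zero (sd = 5)
measured = true + bias + error

# Bias moves the mean but not the variance; random error inflates the variance,
# so only the random component dilutes the true-to-observed variance ratio.
ratio = true.var() / measured.var()
print(f"true-to-observed variance ratio ≈ {ratio:.2f}")  # ≈ 10² / (10² + 5²) = 0.80
```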

Several sources of variance in test scores: which to include in estimating reliability?
– Variance between patients
– Variance due to different observers
– Fluctuations over time: day of week or time of day
– Changes in the measurement instrument (e.g. reagents degrade)
– Changes in definitions (e.g. revised diagnostic codes)
– Random errors (various sources)

Reliability = Subject Variability / (Subject Variability + Measurement Error)

or,

Reliability = Subject Variability / (Subject Variability + Observer Variability + Measurement Error)
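In standard psychometric notation (the σ² symbols are a gloss, not on the original slide), these two ratios are:

```latex
\text{Reliability} \;=\; \frac{\sigma^2_{\text{subjects}}}{\sigma^2_{\text{subjects}} + \sigma^2_{\text{error}}}
\qquad \text{or} \qquad
\text{Reliability} \;=\; \frac{\sigma^2_{\text{subjects}}}{\sigma^2_{\text{subjects}} + \sigma^2_{\text{observers}} + \sigma^2_{\text{error}}}
```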

Generalizability Theory
An ANOVA model that estimates each source of variability separately:
– Observer inconsistency over time
– Discrepancies between observers
– Changes in the subject being assessed over time
Quantifies these, and helps to show how to optimize the design (and administration) of a test given these performance characteristics.
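A minimal sketch of the idea, assuming a toy subjects × observers table of ratings (the data are invented for illustration); each variance component is estimated from the two-way ANOVA mean squares:

```python
import numpy as np

# Hypothetical ratings: rows = subjects, columns = observers
x = np.array([[7., 8., 7.],
              [5., 6., 5.],
              [9., 9., 8.],
              [4., 5., 4.],
              [6., 7., 6.]])
n, k = x.shape
grand = x.mean()

# Two-way ANOVA sums of squares and mean squares (one rating per cell)
ss_subj = k * ((x.mean(axis=1) - grand) ** 2).sum()
ss_obs = n * ((x.mean(axis=0) - grand) ** 2).sum()
ss_res = ((x - grand) ** 2).sum() - ss_subj - ss_obs
ms_subj = ss_subj / (n - 1)
ms_obs = ss_obs / (k - 1)
ms_res = ss_res / ((n - 1) * (k - 1))

# Variance components (negative estimates are conventionally set to zero)
var_subj = (ms_subj - ms_res) / k   # between-subject variance
var_obs = (ms_obs - ms_res) / n     # between-observer variance
print(f"subjects: {var_subj:.2f}, observers: {var_obs:.2f}, residual: {ms_res:.2f}")
```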

2. Classical Test Theory
Distinguishes random error from systematic error, or bias: random error = unreliability; bias = invalidity.
Classical test theory assumes:
– Errors are independent of the score (i.e. similar errors occur at all levels of the variable being measured)
– The mean of the errors = zero (some errors increase and some decrease the score; these errors balance out)
Hence, random errors tend to cancel out if enough observations are made, so a large sample can give you an accurate estimate of the population mean even if the measure is unreliable. Useful!
– From the above: Observed score = True score + Error (additive: no interaction between score and error)
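A quick illustration of the "errors cancel out" claim, using simulated data with arbitrary parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
true_scores = rng.normal(100, 15, 50_000)           # population of true scores
observed = true_scores + rng.normal(0, 30, 50_000)  # very unreliable measure

# Individual observed scores are badly distorted (error sd twice the true sd),
# but the errors have mean zero, so the sample mean is still nearly unbiased.
print(f"true mean = {true_scores.mean():.2f}, observed mean = {observed.mean():.2f}")
```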

Reliability versus Sensitivity of a Measurement: the metaphor of the combs
– A fine-grained scale may produce more error variance
– A coarse measure will appear more stable but is less sensitive

Reliability and Precision
Some sciences use “precision” to refer to the close grouping of results that, in the metaphor of the shooting target, we called “reliability”; you may also see “accuracy” used in place of our “validity”. These terms are common in the laboratory disciplines, and you should be aware of the contrasting usage. The difference arises in part because measurements in the social sciences need to distinguish three concepts: reliability and validity, plus the level of detail a measure is capable of revealing (the number of significant digits it provides). Thus, rating pain as “moderate” is imprecise, and yet it could be done reliably, and it may also be valid (as far as we can tell!). By contrast, mechanical measurements in the laboratory sciences can be sufficiently consistent that they have little need for our concept of reliability.

3. Consistency over time
One way to test reliability is to repeat the measurement: if you get the same score, it’s reliable. But this runs into the problem that a real change in health may occur over time, giving a falsely negative impression of reliability. Alternatively, people may remember their replies, perhaps falsely inflating reliability.
To avoid this you could correlate different, but equivalent, versions of the test. One approach is to divide the whole test into two halves and correlate them (“split-half reliability”) → formulas by Kuder & Richardson, and Cronbach’s alpha, generalize this.
This leads to internal consistency: a reliable test is one whose items are very similar. Test this using item-total correlations.
The more items, the lower the error: more items → higher reliability. The Spearman-Brown formula estimates this (see the sketch below).
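A minimal sketch of these calculations in Python; the function names and the simulated one-trait item structure are my own, not from the slides:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a subjects x items score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

def spearman_brown(r, factor):
    """Predicted reliability if test length is multiplied by `factor`."""
    return factor * r / (1 + (factor - 1) * r)

# Simulate 8 items that all tap one underlying trait, plus item-specific noise
rng = np.random.default_rng(7)
trait = rng.normal(0, 1, (500, 1))
items = trait + rng.normal(0, 1, (500, 8))

print(f"alpha ≈ {cronbach_alpha(items):.2f}")

# Split-half: correlate odd-item and even-item halves, then step the half-test
# reliability up to full length with Spearman-Brown (length factor = 2)
half1, half2 = items[:, ::2].sum(axis=1), items[:, 1::2].sum(axis=1)
r_half = np.corrcoef(half1, half2)[0, 1]
print(f"split-half r ≈ {r_half:.2f}, full-length ≈ {spearman_brown(r_half, 2):.2f}")
```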

4. Statistics to use: Intra-class correlation vs. Pearson r
[Two panels: perfect agreement (ICC = 1.0, r = 1.0) versus a systematic shift between ratings (r = 1.0 but ICC < 1.0: systematic error, i.e. bias).]
Message: a re-test correlation will ignore a systematic change in scores over time. An ICC measures agreement, so it will penalize retest reliability when a shift occurs. Which do you prefer?
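To make the slide's point concrete, here is a small sketch with invented numbers; the ICC variant computed is the two-way agreement ICC(A,1), one common choice:

```python
import numpy as np

rater_a = np.array([10., 12., 14., 16., 18.])
rater_b = rater_a + 5.0   # pure systematic shift: bias, no random error

r = np.corrcoef(rater_a, rater_b)[0, 1]   # Pearson r = 1.0 despite the shift

# Agreement ICC from two-way ANOVA mean squares (subjects x raters)
x = np.column_stack([rater_a, rater_b])
n, k = x.shape
grand = x.mean()
ms_rows = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)
ms_cols = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)
ss_err = ((x - grand) ** 2).sum() - (n - 1) * ms_rows - (k - 1) * ms_cols
ms_err = ss_err / ((n - 1) * (k - 1))

# McGraw & Wong ICC(A,1): rater variance stays in the denominator,
# so a systematic shift between raters lowers the coefficient.
icc = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)
print(f"r = {r:.2f}, ICC(A,1) = {icc:.2f}")   # r = 1.00, ICC ≈ 0.44
```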

Self-test fun time! What is the reliability when:
– Every student is rated “above average”?
– Physician A rates every BP as 5 mm Hg higher than physician B?
– The measure is applied to a different population?
– The observers change?
– The patients do, in reality, improve over time?