Reliability (IOP 301-T, Mr. Rajesh Gunesh)

IOP 301-T Mr. Rajesh Gunesh Reliability  Reliability means repeatability or consistency  A measure is considered reliable if it would give us the same result over and over again (assuming that what we are measuring isn’t changing!)

Definition of Reliability
 Reliability usually “refers to the consistency of scores obtained by the same persons when they are reexamined with the same test on different occasions, or with different sets of equivalent items, or under other variable examining conditions” (Anastasi & Urbina, 1997)
 Dependable, consistent, stable, constant
 Gives the same result over and over again

Validity vs Reliability
 Reliability concerns the consistency of a measure; validity concerns whether it measures what it is supposed to measure. A measure can be reliable without being valid, but it cannot be valid without being reliable.

IOP 301-T Mr. Rajesh Gunesh Variability and reliability  What is the acceptable range of error in measurement – Bathroom scale ±1 kg – Body thermometer ±0.2 C – Baby weight scale ±20 g – Clock with hands ±5 min – Outside thermometer ±1 C

Variability and reliability
We are completely comfortable with a bathroom scale accurate to ±1 kg, since we know that individual weights vary over far greater ranges than this, and typical day-to-day changes are of about the same order of magnitude.

IOP 301-T Mr. Rajesh Gunesh Reliability  True Score Theory  Measurement Error  Theory of reliability  Types of reliability  Standard error of measurement

True Score Theory

IOP 301-T Mr. Rajesh Gunesh True Score Theory  Every measurement is an additive composite of two components: 1. True ability (or the true level) of the respondent on that measure 2. Measurement error

IOP 301-T Mr. Rajesh Gunesh True Score Theory  Individual differences in test scores – “True” differences in characteristic being assessed – “Chance” or random errors.

IOP 301-T Mr. Rajesh Gunesh True Score Theory  What might be considered error variance in one situation may be true variance in another (e.g Anxiety)

Can we observe the true score?
X = T + e_X
 We only observe the measurement X; we don’t observe what’s on the right side of the equation (only God knows what those values are)

True Score Theory
var(X) = var(T) + var(e_X)
 The variability of the measure is the sum of the variability due to true score and the variability due to random error
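A quick way to see this decomposition is to simulate it. The sketch below is not from the original slides; it draws hypothetical true scores and independent errors in Python and checks that var(X) ≈ var(T) + var(e).

```python
# Illustrative sketch of var(X) = var(T) + var(e) on simulated data.
# All numbers are hypothetical; they only demonstrate the identity.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
T = rng.normal(50, 10, n)   # hypothetical true scores
e = rng.normal(0, 5, n)     # independent random measurement error
X = T + e                   # observed scores

print(round(X.var(), 1))            # ~125.0
print(round(T.var() + e.var(), 1))  # ~125.0, since T and e are uncorrelated
```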

What is error variance?
 Conditions irrelevant to the purpose of the test:
– Environment (e.g., quiet vs. noisy)
– Instructions (e.g., written vs. verbal)
– Time limits (e.g., limited vs. unlimited)
– Rapport with the test taker
 All test scores have error variance.

IOP 301-T Mr. Rajesh Gunesh Measurement Error  Measurement error: – Random – Systematic

IOP 301-T Mr. Rajesh Gunesh Measurement Error  Random error: effects are NOT consistent across the whole sample, they elevate some scores and depress others – Only adds noise; does not affect mean score

IOP 301-T Mr. Rajesh Gunesh Measurement Error  Systematic error: effects are generally consistent across a whole sample – Example: environmental conditions for group testing (e.g., temperature of the room) – Generally either consistently positive (elevate scores) or negative (depress scores)

Theory of Reliability

Reliability
Reliability = (variance of the true score) / (variance of the measure)
Reliability = Var(T) / Var(X)

How big is an estimate of reliability?
Reliability = Var(T) / Var(X) = Var(T) / [Var(T) + Var(e)]
Reliability = subject variability / (subject variability + measurement error)

IOP 301-T Mr. Rajesh Gunesh  We can’t compute reliability because we can’t calculate the variance of the true score; but we can get an estimate of the variability.

Estimate of Reliability
 Observations are related to each other to the degree that they share true scores. For example, consider the correlation between two parallel measurements X1 and X2:

Since X1 = T + e1 and X2 = T + e2 with independent errors, Cov(X1, X2) = Var(T) and Corr(X1, X2) = Var(T) / Var(X); that is, the correlation between two parallel measurements estimates the reliability.
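To see this numerically, here is a hedged sketch (hypothetical parameters, same simulation idea as above): two parallel scores sharing a true score correlate at about Var(T) / (Var(T) + Var(e)).

```python
# Correlation between two parallel measurements recovers reliability.
# Hypothetical values: Var(T) = 100, Var(e) = 25, so reliability = 0.8.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
T = rng.normal(50, 10, n)
X1 = T + rng.normal(0, 5, n)   # first parallel measurement
X2 = T + rng.normal(0, 5, n)   # second, with independent error

observed_r = np.corrcoef(X1, X2)[0, 1]
theoretical = 10**2 / (10**2 + 5**2)
print(round(observed_r, 3), theoretical)   # ~0.8 and 0.8
```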

Types of Reliability
1. Test-Retest Reliability
Used to assess the consistency of a measure from one time to another
2. Alternate-Form Reliability
Used to assess the consistency of the results of two tests constructed the same way from the same content domain

Types of Reliability
3. Split-Half Reliability
Used to assess the consistency of results across items within a test by splitting them into two equivalent halves
 Kuder-Richardson Reliability
Used to assess the extent to which items are homogeneous when items have a dichotomous response, e.g., “yes/no” items

IOP 301-T Mr. Rajesh Gunesh Types of Reliability  Cronbach’s alpha (α) Reliability Compares the consistency of response of all items on the scale (Likert scale or linear graphic response format) 4.Inter-Rater or Inter-Scorer Reliability Used to assess the concordance between two or more observers scores of the same event or phenomenon for observational data

IOP 301-T Mr. Rajesh Gunesh Test-Retest Reliability  Definition: When the same test is administered to the same individual (or sample) on two different occasions

Test-Retest Reliability: used to assess the consistency of a measure from one time to another

IOP 301-T Mr. Rajesh Gunesh Test-Retest Reliability  Statistics used – Pearson r or Spearman rho  Warning – Correlation decreases over time because error variance INCREASES (and may change in nature) – Closer in time the two scores were obtained, the more the factors which contribute to error variance are the same

IOP 301-T Mr. Rajesh Gunesh Test-Retest Reliability  Warning – Circumstances may be different for both test-taker and physical environment. – Transfer effects like practice and memory might play a role on the second testing occasion

IOP 301-T Mr. Rajesh Gunesh Alternate-form Reliability  Definition: Two equivalent forms of the same measure are administered to the same group on two different occasions

Alternate-Form Reliability: used to assess the consistency of the results of two tests constructed the same way from the same content domain

IOP 301-T Mr. Rajesh Gunesh Alternate-form Reliability  Statistic used – Pearson r or Spearman rho  Warning – Even when randomly chosen, the two forms may not be truly parallel – It is difficult to construct equivalent tests

IOP 301-T Mr. Rajesh Gunesh Alternate-form Reliability  Warning – Even when randomly chosen, the two forms may not be truly parallel – It is difficult to construct equivalent tests – The tests should have the same number of items, same scoring procedure, uniform content and item difficulty level

IOP 301-T Mr. Rajesh Gunesh Split-half Reliability  Definition: Randomly divide the test into two forms; calculate scores for Form A, B; calculate Pearson r as index of reliability

Split-Half Reliability (Spearman-Brown formula)
r_full = 2 r_half / (1 + r_half)
where r_half is the correlation between the two half-test scores and r_full is the stepped-up reliability estimate for the full-length test

IOP 301-T Mr. Rajesh Gunesh Split-half Reliability  Warning The correlation between the odd and even scores are generally an underestimation of the reliability coefficient because it is based only on half the test.

Cronbach’s alpha & Kuder-Richardson-20
Measure the extent to which items on a test are homogeneous; equal to the mean of all possible split-half coefficients
– Kuder-Richardson-20 (KR-20): for dichotomous data
– Cronbach’s alpha: for non-dichotomous data

Cronbach’s alpha (α) (coefficient alpha)
α = [k / (k − 1)] × [1 − (Σ σi²) / σX²]
where k is the number of items, σi² is the variance of item i, and σX² is the variance of total test scores

Kuder-Richardson (KR-20)
KR-20 = [k / (k − 1)] × [1 − (Σ pi qi) / σX²]
where pi is the proportion of respondents answering item i correctly (or “yes”) and qi = 1 − pi
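Both coefficients can be computed directly from a person-by-item score matrix. A hedged sketch (hypothetical data; the ddof choice is a convention and differs across textbooks):

```python
# Cronbach's alpha for polytomous items and KR-20 for dichotomous items.
# Both follow (k / (k - 1)) * (1 - sum(item variances) / variance(totals)).
# Data are hypothetical; rows = persons, columns = items.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

likert = np.array([[4, 5, 3, 4], [2, 1, 2, 3], [5, 4, 5, 5],
                   [3, 3, 2, 3], [1, 2, 1, 2]])
print(round(cronbach_alpha(likert), 3))

# For 0/1 data the item variance (with ddof=0) is p*q, so the same
# formula gives KR-20:
yes_no = np.array([[1, 1, 0, 1], [0, 0, 0, 1], [1, 1, 1, 1],
                   [1, 0, 1, 1], [0, 0, 0, 0]])
k = yes_no.shape[1]
p = yes_no.mean(axis=0)
kr20 = (k / (k - 1)) * (1 - (p * (1 - p)).sum() / yes_no.sum(axis=1).var())
print(round(kr20, 3))
```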

Inter-Rater or Inter-Observer Reliability: used to assess the degree to which different raters or observers give consistent estimates of the same phenomenon

IOP 301-T Mr. Rajesh Gunesh Inter-rater Reliability  Definition Measures the extent to which multiple raters or judges agree when providing a rating of behavior

IOP 301-T Mr. Rajesh Gunesh Inter-rater Reliability  Statistics used – Nominal/categorical data Kappa statistic – Ordinal data Kendall’s tau to see if pairs of ranks for each of several individuals are related –Two judges rate 20 elementary school children on an index of hyperactivity and rank order them

IOP 301-T Mr. Rajesh Gunesh Inter-rater Reliability  Statistics used – Interval or ratio data Pearson r using data obtained from the hyperactivity index

Factors affecting Reliability
 Whether a measure is speeded
 Variability in individual scores
 Ability level

Whether a measure is speeded
For speeded measures, test-retest and equivalent-form reliability are more appropriate. Split-half techniques may be considered only if the split is made by time rather than by number of items.

Variability in individual scores
Correlation is affected by the range of individual differences in a group: when the range of scores is narrowed, the correlation (and hence the reliability estimate) drops. Smaller subgroups can therefore display correlation coefficients quite different from that of the whole group. This phenomenon is known as range restriction.
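A small simulation makes the point (hypothetical values): restricting the sample to a narrow ability band shrinks the correlation between two parallel scores.

```python
# Range restriction: reliability estimated in the full sample versus in
# a subgroup restricted to high true scores. Hypothetical data.
import numpy as np

rng = np.random.default_rng(7)
n = 50_000
T = rng.normal(50, 10, n)
X1 = T + rng.normal(0, 5, n)
X2 = T + rng.normal(0, 5, n)

full_r = np.corrcoef(X1, X2)[0, 1]
top = T > 60                      # keep only a narrow high-ability band
restricted_r = np.corrcoef(X1[top], X2[top])[0, 1]
print(round(full_r, 2), round(restricted_r, 2))   # ~0.80 vs ~0.44
```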

Ability level
One must also consider the variability and ability levels of samples. It is advisable to compute separate reliability coefficients for homogeneous and heterogeneous subgroups.

Interpretation of Reliability
One must ask oneself the following questions:
 How high must the coefficient of reliability be?
 How is it interpreted?
 What is the standard error of measurement?

Magnitude of the reliability coefficient
– Anastasi & Urbina (1997): 0.8 – 0.9
– Huysamen (1996): at least 0.85 for individuals; at least 0.65 for groups
– Smit (1996): 0.8 – 0.85 for personality & interest; at least 0.9 for aptitude

IOP 301-T Mr. Rajesh Gunesh Standard Error of the Measurement  Definition: Estimate of the amount of error usually attached to an individual’s obtained test score – As SEM ↑, test reliability ↓ – As SEM ↓, test reliability ↑

Standard Error of Measurement
SEM = SD × √(1 − r_xx)
where SD is the standard deviation of the test scores and r_xx is the reliability coefficient

IOP 301-T Mr. Rajesh Gunesh Standard Error of the Measurement  Confidence Interval: Uses SEM to calculate a band or range of scores that has a high probability of including the person’s true score.  Example: 95% confidence interval means only 5 times in 100 will the person’s TRUE score lie outside this range of scores.

IOP 301-T Mr. Rajesh Gunesh Reliability  Formula: CI = Obtained score + z(SEM) z = 1.0 for 68% level z = 1.44 for 85% level z = 1.65 for 90% level z = 1.96 for 95% level z = 2.58 for 99% level

Reliability of standardized tests
 An acceptable standardized test should have reliability coefficients of at least:
– 0.95 for internal consistency
– 0.90 for test-retest (stability)
– 0.85 for alternate forms (equivalency)

Reliability: Implications
 Evaluating a test
– What types of reliability have been calculated, and with what samples?
– How strong are the reliability coefficients?
– What is the SEM for a test score?
– How does this information influence the decision to use the test and interpret its scores?