1
Lesson Seven: Reliability
2
Contents
Definition of reliability
Indication of reliability: the reliability coefficient
Ways of obtaining a reliability coefficient:
  Alternate/Parallel forms
  Test-retest
  Split-half
  (Inter-)rater (or scorer) reliability
Two ways of testing reliability
How to make sure the test is reliable
3
Definition of Reliability
“The consistency of measures across different times, test forms, raters, and other characteristics of the measurement context” (Bachman, 1990, p. 24).
The accuracy or precision with which a test measures something; the consistency, dependability, or stability of test results.
4
Reliability coefficient (r)
Quantifies the reliability of a test and allows us to compare the reliability of different tests.
0 ≤ r ≤ 1 (ideally r = 1, which means the test gives precisely the same results for particular testees regardless of when it happens to be administered).
If r = 1, the test is 100% reliable.
A good achievement test: r ≥ .90.
5
Alternate/Parallel forms: the most stringent method
Two forms, two administrations.
Equivalent forms (i.e., different items testing the same content) are taken by the same test takers on different days.
If r is high, the test is said to have good reliability.
Test plan → Form A / Form B
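A minimal sketch of how such a coefficient could be computed, assuming made-up Form A and Form B scores for the same five test takers; the same correlation applies to the test-retest design on the next slide, with Trial 1 and Trial 2 scores in place of the two forms:

```python
import numpy as np

# Hypothetical scores of the same five test takers on two equivalent forms.
form_a = np.array([78, 85, 62, 90, 71])
form_b = np.array([80, 83, 65, 88, 70])

# Estimate the reliability coefficient as the Pearson correlation
# between the two sets of scores.
r = np.corrcoef(form_a, form_b)[0, 1]
print(f"Parallel-forms reliability: r = {r:.2f}")
```

With these figures r comes out close to 1; with real data the r ≥ .90 benchmark mentioned earlier is what a good achievement test should reach.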
6
Test-retest
One form, two administrations.
The same test is administered to the same testees with a short time lag, and then r is calculated.
Appropriate for highly speeded tests.
Test A → Trial 1 / Trial 2
7
Split-half (Spearman-Brown procedure)
One test, one administration.
Split the test into halves (e.g., odd-numbered vs. even-numbered questions) to form two sets of scores.
A measure of internal consistency (related estimates: KR-20, KR-21).
Q1 Q2 Q3 Q4 Q5 Q6 → First half / Second half
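A minimal sketch of the split-half procedure, assuming a made-up 0/1 item-score matrix (rows are test takers, columns are items); the Spearman-Brown formula r_full = 2·r_half / (1 + r_half) steps the half-test correlation up to an estimate for the full-length test:

```python
import numpy as np

# Hypothetical item-score matrix: rows = test takers, columns = items (1 = correct, 0 = wrong).
scores = np.array([
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 1, 0],
])

# Split into odd- and even-numbered items and total each half.
odd_half = scores[:, 0::2].sum(axis=1)
even_half = scores[:, 1::2].sum(axis=1)

# Correlation between the two half-test score sets.
r_half = np.corrcoef(odd_half, even_half)[0, 1]

# Spearman-Brown correction: estimate the reliability of the full-length test.
r_full = (2 * r_half) / (1 + r_half)
print(f"Half-test r = {r_half:.2f}, Spearman-Brown corrected r = {r_full:.2f}")
```

KR-20 and KR-21, mentioned on the slide, are alternative internal-consistency estimates computed from the same kind of dichotomous item-score matrix.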
8
(Inter-)rater (or scorer) reliability
Needed for subjective tests (e.g., writing, oral tests) when two or more independent raters are involved in scoring.
Raters should be trained before scoring.
Compare the scores of the same testee given by different raters. If r is high, there is inter-rater reliability.
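A minimal sketch, assuming made-up scores given by three trained raters to the same six essays; one simple summary of agreement is the average of the pairwise Pearson correlations between raters, which mirrors the slide's idea of comparing the scores different raters give to the same testees:

```python
import numpy as np

# Hypothetical essay scores: one row per rater, one column per test taker.
ratings = np.array([
    [7, 8, 6, 9, 5, 7],   # Rater 1
    [8, 8, 6, 9, 6, 7],   # Rater 2
    [7, 7, 5, 9, 5, 6],   # Rater 3
])

# Pairwise correlations between raters (rows are treated as variables).
corr = np.corrcoef(ratings)
pairwise = corr[np.triu_indices_from(corr, k=1)]
print(f"Mean inter-rater correlation: {pairwise.mean():.2f}")
```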
9
Ways of testing reliability
Examine the amount of variation: the Standard Error of Measurement (SEM). The smaller, the better.
Calculate the reliability coefficient, r. The bigger, the better.
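A worked example of the SEM, assuming a made-up score standard deviation and reliability coefficient; the standard formula is SEM = SD · √(1 − r), so higher reliability means a smaller band of measurement error around any individual score:

```python
import math

# Hypothetical figures for a test.
sd = 10.0   # standard deviation of observed scores
r = 0.91    # reliability coefficient

# Standard Error of Measurement: SEM = SD * sqrt(1 - r).
sem = sd * math.sqrt(1 - r)
print(f"SEM = {sem:.1f} score points")
```

Here SEM works out to 3.0 score points; the smaller the SEM, the more dependable an individual's reported score.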
10
How to make sure the test is reliable: tips for teachers
Take enough samples of behavior.
Try to avoid ambiguous items.
Provide clear and explicit instructions.
Make sure the test is well laid out.
Provide uniform and non-distracting conditions of administration.
Try to use objective tests where possible.
Try to use direct tests.
Have independent, trained raters.
Try to identify test takers by number, not by name.
Use multiple independent scorings of subjective tests.
(Hughes, 1989, pp. 36-41)