Classroom Assessment Reliability
Classroom Assessment Reliability
Reliability = assessment consistency:
–Consistency within teachers across students.
–Consistency within teachers over multiple occasions for the same students.
–Consistency across teachers for the same students.
–Consistency across teachers across students.
Three Types of Reliability
–Stability reliability.
–Alternate-form reliability.
–Internal consistency reliability.
Stability Reliability
–Concerned with the question: are assessment results consistent over time (across occasions)?
–Think of some examples where stability reliability might be important.
–Why might test results NOT be consistent over time?
Evaluating Stability Reliability
–Test-retest reliability: compute the correlation between a first and a later administration of the same test (see the sketch below).
–Classification consistency: compute the percentage of students classified consistently over time (examples on the following slides).
–The main concern is the stability of the assessment over time.
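As a concrete illustration of the first bullet, here is a minimal sketch of a test-retest correlation. The score lists are invented for illustration only; a real analysis would use actual student scores.

```python
# A minimal sketch of test-retest reliability: correlate scores from two
# administrations of the same test. Scores below are invented examples.
from statistics import correlation  # available in Python 3.10+

first_admin  = [78, 85, 62, 90, 71, 88, 55, 94]   # hypothetical scores, occasion 1
second_admin = [75, 88, 60, 93, 70, 85, 58, 91]   # same students, occasion 2

r = correlation(first_admin, second_admin)        # Pearson r
print(f"Test-retest reliability estimate: r = {r:.2f}")
```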
Example of Classification Consistency

Test-Retest Reliability Classification Table (blank template; filled-in versions follow)

                 2nd Administration of Test
  1st Admin.  | Upper 3rd | Middle 3rd | Lower 3rd
  Upper 3rd   |           |            |
  Middle 3rd  |           |            |
  Lower 3rd   |           |            |
Example of Classification Consistency (Good Reliability)

Test-Retest Reliability Classification Table

                 2nd Administration of Test
  1st Admin.  | Upper 3rd | Middle 3rd | Lower 3rd
  Upper 3rd   |    35     |     5      |     2
  Middle 3rd  |     4     |    32      |     6
  Lower 3rd   |     1     |     3      |    38
Example of Classification Consistency (Poor Reliability)

Test-Retest Reliability Classification Table

                 2nd Administration of Test
  1st Admin.  | Upper 3rd | Middle 3rd | Lower 3rd
  Upper 3rd   |    13     |    15      |     4
  Middle 3rd  |    10     |    24      |     8
  Lower 3rd   |    11     |    10      |    18
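Here is a sketch of how the consistency percentage behind these labels could be computed. The helper name `classification_consistency` is ours, and the counts are read off the two tables above.

```python
# A sketch: percentage of students placed in the same third on both
# administrations, i.e. the diagonal of a classification table.

def classification_consistency(table):
    """table[i][j] = students in third i on admin 1 and third j on admin 2."""
    total = sum(sum(row) for row in table)
    agree = sum(table[i][i] for i in range(len(table)))
    return 100 * agree / total

good = [[35, 5, 2], [4, 32, 6], [1, 3, 38]]      # "good reliability" table
poor = [[13, 15, 4], [10, 24, 8], [11, 10, 18]]  # "poor reliability" table
print(f"good: {classification_consistency(good):.0f}% consistent")  # 83%
print(f"poor: {classification_consistency(poor):.0f}% consistent")  # 49%
```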
Alternate-Form Reliability
Are two supposedly equivalent forms of an assessment actually equivalent?
–The two forms do not have to yield identical scores.
–The correlation between two or more forms of the assessment should be reasonably substantial.
Evaluating Alternate-Form Reliability
–Administer two forms of the assessment to the same individuals and correlate the results.
–Determine the extent to which the same students are classified the same way by the two forms.
–Alternate-form reliability is established by evidence, not by proclamation.
Example of Using a Classification Table to Assess Alternate-Form Reliability (Good Reliability)

Alternate-Form Reliability Classification Table

                          Form B
  Form A      | Upper 3rd | Middle 3rd | Lower 3rd
  Upper 3rd   |     6     |     2      |     1
  Middle 3rd  |     1     |     7      |     2
  Lower 3rd   |     0     |     3      |     7
Example of Using a Classification Table to Assess Alternate-Form Reliability (Poor Reliability)

Alternate-Form Reliability Classification Table

                          Form B
  Form A      | Upper 3rd | Middle 3rd | Lower 3rd
  Upper 3rd   |     3     |     2      |     4
  Middle 3rd  |     2     |     4      |     3
  Lower 3rd   |     2     |     3      |     5
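The same diagonal-percentage check applies to alternate forms. A self-contained sketch, using the counts read off the two tables above:

```python
# Applying the diagonal-percentage idea to the alternate-form tables above.
good_ab = [[6, 2, 1], [1, 7, 2], [0, 3, 7]]   # counts from the "good" table
poor_ab = [[3, 2, 4], [2, 4, 3], [2, 3, 5]]   # counts from the "poor" table
for name, t in [("good", good_ab), ("poor", poor_ab)]:
    pct = 100 * sum(t[i][i] for i in range(3)) / sum(map(sum, t))
    print(f"{name}: {pct:.0f}% classified the same by both forms")
# good: 69%   poor: 43%
```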
Internal Consistency Reliability
–Concerned with the extent to which the items (or components) of an assessment function consistently.
–To what extent do the items in an assessment measure a single attribute?
–For example, consider a math problem-solving test: to what extent does reading comprehension play a role? What is being measured?
Evaluating Internal Consistency Reliability
–Split-half correlations.
–Kuder-Richardson Formula 20 (KR-20): used with binary-scored (dichotomous) items; equal to the average of all possible split-half correlations.
–Cronbach's coefficient alpha: similar to KR-20, except used with non-binary (polytomous) items (e.g., items that measure attitude). See the sketch below.
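A minimal sketch of coefficient alpha; with 0/1 (dichotomous) items it reduces to KR-20. The function name and the small data matrix are invented for illustration.

```python
# A sketch of Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item variances) /
# variance of total scores). With 0/1 items this is KR-20.

def cronbach_alpha(scores):
    """scores[s][i] = score of student s on item i."""
    n, k = len(scores), len(scores[0])
    item_vars = []
    for i in range(k):                          # population variance per item
        col = [row[i] for row in scores]
        m = sum(col) / n
        item_vars.append(sum((x - m) ** 2 for x in col) / n)
    totals = [sum(row) for row in scores]       # variance of total scores
    mt = sum(totals) / n
    total_var = sum((t - mt) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

binary = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 1, 1],
          [0, 0, 0, 0], [1, 1, 1, 1]]           # invented dichotomous data
print(f"alpha (= KR-20 here): {cronbach_alpha(binary):.2f}")  # ~0.70
```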
Reliability Components of an Observation

O = T + E
(Observation = True Status + Error)
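The slide stops at the decomposition, but in classical test theory (assuming the error term is uncorrelated with true status) it leads directly to the usual definition of reliability as the proportion of observed-score variance that is true-score variance:

```latex
O = T + E, \qquad
\sigma_O^2 = \sigma_T^2 + \sigma_E^2, \qquad
\text{reliability} = \frac{\sigma_T^2}{\sigma_O^2}
```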
Standard Error of Measurement (SEM)
–Provides an index of the reliability of an individual's score.
–The standard deviation of the theoretical distribution of errors (i.e., the E's).
–The more reliable a test, the smaller the SEM.
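The slides don't give the formula, but the standard relation ties the SEM to the score standard deviation and the reliability coefficient, making the last bullet concrete:

```latex
\mathrm{SEM} = \sigma_O \sqrt{1 - r_{OO'}}
```

For example, a test with a standard deviation of 10 and reliability .91 has SEM = 10 × √.09 = 3 score points.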
Sources of Error in Measurement

Individual characteristics:
–Anxiety
–Motivation
–Health
–Fatigue
–Understanding (of task)
–“Bad hair day”

External characteristics:
–Directions
–Environmental disturbances
–Scoring errors
–Observer differences/biases
–Sampling of items
Things to Do to Improve Reliability
–Use more items or tasks (the formula below shows how length affects reliability).
–Use items or tasks that differentiate among students.
–Use items or tasks that measure within a single content domain.
–Keep scoring objective.
–Eliminate (or reduce) extraneous influences.
–Use shorter assessments more frequently.
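The slides don't name it, but the first point is conventionally quantified by the Spearman-Brown prophecy formula (a standard result, not from the slides), which predicts the reliability of a test lengthened by a factor of n:

```latex
r_{\text{new}} = \frac{n\, r_{\text{old}}}{1 + (n - 1)\, r_{\text{old}}}
% e.g., doubling (n = 2) a test with r_old = .60 predicts r_new = 1.2/1.6 = .75
```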
End