
Classroom Assessment Reliability

Classroom Assessment Reliability

Reliability = Assessment Consistency.
–Consistency within teachers across students.
–Consistency within teachers over multiple occasions for students.
–Consistency across teachers for the same students.
–Consistency across teachers across students.

Three Types of Reliability

–Stability reliability.
–Alternate-form reliability.
–Internal consistency reliability.

Stability Reliability

–Concerned with the question: Are assessment results consistent over time (over occasions)?
–Think of some examples where stability reliability might be important.
–Why might test results NOT be consistent over time?

Evaluating Stability Reliability

–Test-retest reliability: compute the correlation between a first and a later administration of the same test.
–Classification consistency: compute the percentage of consistent student classifications over time. (Example on next slide.)
–The main concern is with the stability of the assessment over time.
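The test-retest correlation described above can be sketched in a few lines of Python. The two score lists are hypothetical, not from the slides:

```python
# Test-retest reliability as the Pearson correlation between a first
# and a later administration of the same test (scores are hypothetical).
from statistics import mean, stdev

first = [72, 85, 60, 90, 78, 55, 88, 67]   # first administration
second = [70, 88, 62, 91, 75, 58, 85, 70]  # same students, later administration

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

r = pearson(first, second)
print(f"test-retest reliability: r = {r:.2f}")
```

A high correlation (close to 1.0) indicates that students kept roughly the same relative standing across the two occasions.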

Example of Classification Consistency

Test-Retest Reliability Classification Table

                2nd Administration of Test
1st Admin.     Upper 3rd   Middle 3rd   Lower 3rd
Upper 3rd
Middle 3rd
Lower 3rd

Example of Classification Consistency (Good Reliability)

Test-Retest Reliability Classification Table

                2nd Administration of Test
1st Admin.     Upper 3rd   Middle 3rd   Lower 3rd
Upper 3rd         35           5           2
Middle 3rd         4          32           6
Lower 3rd          1           3          38
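The classification-consistency percentage for a table like this is simply the share of students on the diagonal (classified into the same third on both occasions). A minimal Python sketch using the good-reliability values:

```python
# Classification consistency: the percentage of students placed in the
# same third of the class on both administrations.
table = [
    [35, 5, 2],   # upper third on 1st administration
    [4, 32, 6],   # middle third
    [1, 3, 38],   # lower third
]

total = sum(sum(row) for row in table)
consistent = sum(table[i][i] for i in range(3))  # diagonal = same class both times
print(f"{consistent}/{total} students = {100 * consistent / total:.0f}% consistent")
```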

Example of Classification Consistency (Poor Reliability)

Test-Retest Reliability Classification Table

                2nd Administration of Test
1st Admin.     Upper 3rd   Middle 3rd   Lower 3rd
Upper 3rd
Middle 3rd
Lower 3rd

Alternate-Form Reliability

Are two, supposedly equivalent, forms of an assessment in fact equivalent?
–The two forms do not have to yield identical scores.
–The correlation between two or more forms of the assessment should be reasonably substantial.

Evaluating Alternate-Form Reliability

–Administer two forms of the assessment to the same individuals and correlate the results.
–Determine the extent to which the same students are classified the same way by the two forms.
–Alternate-form reliability is established by evidence, not by proclamation.

Example of Using a Classification Table to Assess Alternate-Form Reliability

Alternate-Form Reliability Classification Table (Good Reliability)

              Form B
Form A       Upper 3rd   Middle 3rd   Lower 3rd
Upper 3rd        6           2           1
Middle 3rd       1           7           2
Lower 3rd        0           3           7

Example of Using a Classification Table to Assess Alternate-Form Reliability

Alternate-Form Reliability Classification Table (Poor Reliability)

              Form B
Form A       Upper 3rd   Middle 3rd   Lower 3rd
Upper 3rd        3           2           4
Middle 3rd       2           4           3
Lower 3rd        2           3           5

Internal Consistency Reliability

–Concerned with the extent to which the items (or components) of an assessment function consistently.
–To what extent do the items in an assessment measure a single attribute?
–For example, consider a math problem-solving test. To what extent does reading comprehension play a role? What is being measured?

Evaluating Internal Consistency Reliability

–Split-half correlations.
–Kuder-Richardson Formula 20 (KR-20).
   Used with binary-scored (dichotomous) items.
   Average of all possible split-half correlations.
–Cronbach's coefficient alpha.
   Similar to KR-20, except used with non-binary-scored (polytomous) items (e.g., items that measure attitudes).
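A minimal sketch of the KR-20 computation for binary-scored items, using the standard formula KR-20 = (k / (k − 1)) · (1 − Σpq / variance of total scores). The response matrix below is hypothetical:

```python
# KR-20 internal-consistency estimate for binary-scored items.
# Rows = students, columns = items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 1, 0],
]

k = len(responses[0])                     # number of items
n = len(responses)                        # number of students
totals = [sum(row) for row in responses]  # each student's total score
mean_total = sum(totals) / n
var_total = sum((t - mean_total) ** 2 for t in totals) / n  # population variance

# Sum of item variances p*(1-p), where p = proportion answering correctly.
pq = 0.0
for i in range(k):
    p = sum(row[i] for row in responses) / n
    pq += p * (1 - p)

kr20 = (k / (k - 1)) * (1 - pq / var_total)
print(f"KR-20 = {kr20:.2f}")
```

For polytomous items (e.g., attitude scales), the same formula with item variances in place of p·q gives Cronbach's alpha.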

Reliability Components of an Observation

O = T + E
Observation = True Status + Error.
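The O = T + E model can be illustrated with a small simulation: when errors are independent of true status, the observed-score variance is approximately the sum of the true-score and error variances. All distributions below are hypothetical:

```python
# A small simulation of the classical test theory model O = T + E.
import random

random.seed(1)
true_scores = [random.gauss(75, 10) for _ in range(10_000)]  # true status T
errors = [random.gauss(0, 4) for _ in range(10_000)]         # measurement error E
observed = [t + e for t, e in zip(true_scores, errors)]      # observation O

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Because errors are independent of true scores, var(O) ≈ var(T) + var(E).
print(variance(observed), variance(true_scores) + variance(errors))
```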

Standard Error of Measurement

–Provides an index of the reliability of an individual's score.
–The standard deviation of the theoretical distribution of errors (i.e., the E's).
–The more reliable a test, the smaller the SEM.
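A common formula (not shown on the slide) ties the SEM to a test's reliability coefficient: SEM = SD × √(1 − reliability). A quick sketch with hypothetical numbers:

```python
# Standard error of measurement from a reliability coefficient.
import math

sd = 10.0           # standard deviation of observed test scores (hypothetical)
reliability = 0.91  # e.g., a test-retest coefficient (hypothetical)

sem = sd * math.sqrt(1 - reliability)
print(f"SEM = {sem:.1f} score points")
```

Note how a perfectly reliable test (reliability = 1.0) would have an SEM of zero, matching the slide's point that more reliable tests have smaller SEMs.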

Sources of Error in Measurement

Individual characteristics:
–Anxiety
–Motivation
–Health
–Fatigue
–Understanding (of the task)
–"Bad hair day"

External characteristics:
–Directions
–Environmental disturbances
–Scoring errors
–Observer differences/biases
–Sampling of items

Things to Do to Improve Reliability

–Use more items or tasks.
–Use items or tasks that differentiate among students.
–Use items or tasks that measure within a single content domain.
–Keep scoring objective.
–Eliminate (or reduce) extraneous influences.
–Use shorter assessments more frequently.

End