Self-Assessing Locally-Designed Assessments Jennifer Borgioli Learner-Centered Initiatives, Ltd.

Handouts: qualityrubrics.pbworks.com/DATAG

Organizational Focus: Assessment to produce learning… and not just measure learning.

“Less than 20% of teacher preparation programs contain higher level or advanced courses in psychometrics (assessment design) or instructional data analysis.” Inside Higher Education, April 2009

To be assessment savvy….

1999 APA Testing Standards

“The higher the stakes of an assessment’s results, the higher the expectation for the documentation supporting the assessment design and the decisions made based on the assessment results.”

Performance-Based Assessments (PBAs) A performance task is an assessment that requires students to demonstrate achievement by producing an extended written or spoken answer, by engaging in group or individual activities, or by creating a specific product. (Nitko, 2001)

Three Types of Measurement Error: subject effects, test effects, and environmental effects.

Subject Effects

Testing fatigue, test familiarity, and bias can all distort a student’s score.

Test Effects

Final Eyes isn’t about editing; rather, the question is “Is this what you want the students to see/read?”

Test from Period 1 Test from Period 2

Compare with...

Environmental Effects

Reliability = Consistency

Reliability: an indication of how consistently an assessment measures its intended target and the extent to which scores are relatively free of error. Low reliability means that scores cannot be trusted for decision making. Reliability is a necessary but not sufficient condition for validity.

Three general ways to collect evidence of reliability:
Stability: How consistent are the results of an assessment when given on two time-separated occasions?
Alternate Form: How consistent are the results of an assessment when given in two different forms?
Internal Consistency: How consistently do the test’s items function?
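For the first two, the reliability coefficient is typically just the correlation between the two sets of scores from the same students. A minimal sketch in Python (the score lists are hypothetical):

```python
import numpy as np

# Hypothetical scores for the same students on two administrations (or two parallel forms)
first_admin  = [72, 85, 64, 90, 78, 55, 81]
second_admin = [70, 88, 60, 92, 75, 58, 84]

# The Pearson correlation between the two score sets serves as the stability
# (test-retest) or alternate-form reliability coefficient.
coefficient = np.corrcoef(first_admin, second_admin)[0, 1]
print(f"Reliability coefficient: {coefficient:.2f}")
```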

Cronbach’s Alpha: “In statistics, Cronbach’s α (alpha) is a coefficient of reliability. It is commonly used as a measure of the internal consistency or reliability of a psychometric test score for a sample of examinees. Alpha is not robust against missing data.”
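To make the coefficient less abstract, here is a minimal Python sketch of the standard formula, α = k/(k−1) · (1 − Σ item variances / total-score variance), applied to a small hypothetical score matrix (and, per the caveat above, assuming no missing data):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (examinees x items) matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of students' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 students x 4 items, scored 1 (correct) or 0 (incorrect)
scores = [[1, 1, 1, 0],
          [1, 0, 1, 1],
          [0, 0, 1, 0],
          [1, 1, 1, 1],
          [0, 1, 0, 0]]
print(f"Cronbach's alpha: {cronbach_alpha(scores):.2f}")
```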

Item Analysis “This isn’t familiar to me”

Percent of Students Selecting Choice “E”
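A chart like this comes from a simple distractor analysis: tally the percent of students selecting each answer choice and flag items where a distractor such as “E” pulls an unexpectedly large share of responses. A minimal sketch in Python (the response data are hypothetical):

```python
from collections import Counter

# Hypothetical responses to one multiple-choice item, one letter per student
responses = ["A", "E", "C", "E", "E", "B", "E", "A", "E", "D"]

counts = Counter(responses)
for choice in "ABCDE":
    pct = 100 * counts.get(choice, 0) / len(responses)
    print(f"Choice {choice}: {pct:.0f}% of students")
```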

Validity = Accuracy

How do we ensure alignment and validity in assessment? Degrees of Alignment:

Strong: The assessment/learning activity clearly aligns to the target; the assessment/activity and the target are almost one and the same. The language of the standard is explicit. You could confidently infer or conclude the level of student learning/understanding for the target.

Moderate: The assessment/learning activity addresses the target; the target is included in the learning experience but is not the primary focus. The language of the standard is only partially used. You would need an additional data point to confidently infer the level of student learning/understanding for the target.

Weak: The assessment/activity misses the target; it might prepare kids for the target, but doesn’t address it. You could not assess the level of student learning/understanding for the target.

If you want to assess your students’ ability to perform, design, apply, or interpret, then assess them with a performance or product task that requires them to perform, design, apply, or interpret.

How many?
– 5 standards in a PBA (reflected in rows in the rubric)
– 3–5 items per standard on a traditional test

Minimum

Basic

Articulated

One assessment does not an assessment system make.

Fairness and Bias: Fair tests are accessible and enable all students to show what they know. Bias emerges when features of the assessment itself impede students’ ability to demonstrate their knowledge or skills.

In 1876, General George Custer and his troops fought Lakota and Cheyenne warriors at the Battle of the Little Big Horn. If there had been a scoreboard on hand at the end of that battle, which of the following scoreboard representations would have been most accurate?
A. Soldiers > Indians
B. Soldiers = Indians
C. Soldiers < Indians
D. All of the above scoreboards are equally accurate

What are other attributes of quality assessments?

Standard Error of Measurement: an estimate of the consistency of a student’s score if the student had retaken the test innumerable times.

How is the SEM calculated? The SEM is calculated by dividing the SD by the square root of N. This relationship is worth remembering, as it can help you interpret published data.

Calculating the SEM with Excel: Excel does not have a function to compute the standard error of a mean, but it is easy enough to compute from the SD using this formula:
=STDEV()/SQRT(COUNT())
For example, to compute it for the values in cells B1 through B10, use:
=STDEV(B1:B10)/SQRT(COUNT(B1:B10))
The COUNT() function counts the number of numbers in the range. If you are not worried about missing values, you can enter N directly, in which case the formula becomes:
=STDEV(B1:B10)/SQRT(10)
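Note that SD divided by √N is, as the slide says, the standard error of a mean; in classical test theory the standard error of measurement for an individual student’s score is usually estimated as SD·√(1 − reliability). A minimal Python sketch of that version, using a reliability estimate such as the Cronbach’s alpha computed earlier (the score data and the 0.85 reliability value are made up):

```python
import numpy as np

def standard_error_of_measurement(total_scores, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    sd = np.std(total_scores, ddof=1)
    return sd * np.sqrt(1 - reliability)

# Hypothetical total test scores and a reliability estimate (e.g., Cronbach's alpha)
totals = [78, 85, 62, 91, 74, 88, 69, 80]
sem = standard_error_of_measurement(totals, reliability=0.85)
print(f"SEM is about {sem:.1f} points")  # roughly +/- 1 SEM brackets a 68% band around an observed score
```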

WHEN DESIGNING A PRE/POST PERFORMANCE TASK:
The standards and thinking demands must stay the same.
The modality through which students express their thinking must also stay the same.
The content of the baseline and post must be different.
The rubrics for the pre and post will be the same in terms of thinking and modality, but the content dimension will be different.

Jennifer