Assessment Literacy: Test Purpose and Use

Presentation transcript:

Assessment Literacy: Test Purpose and Use
Erika Hall, Center for Assessment
Reidy Interactive Lecture Series, September 29, 2016

Factors Influenced by Purpose & Use: Test Design, Validation, Evaluation (via CLAIMS)

In the program summary we talk about assessment literacy as the ability of educators and others to use and interpret assessment information well. One key aspect of assessment literacy is understanding how and why a test's purpose and the intended use of its results influence the assessment's design, as well as the evidence necessary to support evaluation and validation activities. Many might think that understanding this relationship (or this component of assessment literacy) is something only psychometricians or those with technical expertise must be aware of; however, that is not true, especially when you look beyond large-scale, high-stakes summative assessments. While it is true that the test developer is always responsible for designing a test that meets its specified goals or uses, it often falls to district and school leaders or educators to evaluate the quality of one or more assessments and select the one that will meet their needs, especially when the results are intended to support instruction or program evaluation. In addition, there is often a desire to use assessments for multiple purposes; it is necessary to understand when this may be appropriate and what evidence is needed to support it. So why does the purpose or intended use influence these factors? For a variety of reasons: different uses have different stakes associated with them, which influence how we evaluate the quality and sufficiency of the evidence provided, and different uses prioritize different information (e.g., instructional, diagnostic, summative). While we can't discuss all of the hows and whys underlying the relationship between test purpose/use and these elements, we can address a key aspect of the why. Why do purpose and use matter? Because different test uses require the data to support different types of claims!

Claims

Claims are the statements or inferences you want to make about a student, educator, or school in light of observed test performance, in order to support use of test scores as intended. Claims can:

- Be written at different levels of specificity/granularity:
  - The student has mastered proportional reasoning skills.
  - The student is proficient in Grade 3 mathematics.
- Reflect different types or categories of inferences, such as indicating:
  - Attainment of expected knowledge and skills
  - Readiness for future activities or courses
  - Likelihood of success
- Be specific to different users: students, teachers, schools, districts, the state.

Claims (cont.)

Claims may be based on the obtained data or on transformations/aggregations of that data:

- Direct use of the test score: The student has met the proficiency standard within the assessed content domain.
- Once removed from the test score: The student has shown growth beyond expectation in the content area (a claim based on a transformation of the test score).
- Twice removed from the test score: The teacher is supporting student learning at the level expected (an aggregate of transformations of the data, e.g., SGPs and VAM).

For example, to support the use of data for school accountability, test scores may need to support claims regarding a student's attainment of the knowledge and skills within the content area, as well as claims regarding student growth.
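To make the three "removes" concrete, here is a minimal sketch in Python. The data, the cut-score, and the simplified growth calculation are all hypothetical; real SGP and VAM models are far more sophisticated (conditional score distributions, covariate adjustment), so treat this only as an illustration of how each claim sits one transformation further from the raw score.

```python
from statistics import median

PROFICIENT_CUT = 240  # hypothetical scale-score cut, for illustration only

def is_proficient(score: int) -> bool:
    # Direct use: the raw scale score itself supports the proficiency claim.
    return score >= PROFICIENT_CUT

def growth_percentile(prior: int, current: int,
                      peers: list[tuple[int, int]]) -> float:
    # Once removed: a deliberately crude stand-in for a student growth
    # percentile (SGP) -- the percent of peers with the same prior score
    # whose current score falls below this student's.
    cohort = [cur for pri, cur in peers if pri == prior]
    if not cohort:
        return float("nan")
    return 100 * sum(1 for cur in cohort if cur < current) / len(cohort)

def teacher_median_sgp(sgps: list[float]) -> float:
    # Twice removed: an aggregate of transformed scores, supporting a claim
    # about the teacher rather than about any individual student.
    return median(sgps)

peers = [(230, 238), (230, 245), (230, 251)]
print(is_proficient(244))                      # True   (direct use)
print(growth_percentile(230, 245, peers))      # 33.3.. (once removed)
print(teacher_median_sgp([33.3, 48.0, 71.5]))  # 48.0   (twice removed)
```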

Claims (cont.)

How claims are articulated depends both on how results are intended to be used and on what you feel you need to be able to say in order to use them in that manner. THEORY OF ACTION!!! Claims that are too broad in scope will not be useful for supporting instruction in a particular area, while claims that are too narrow are not likely to support accountability purposes.

Example – Student Accountability

Specific Use: Determining eligibility for promotion to the next grade
User: School
Test-Based Claim: The student has acquired the pre-requisite skills in the content area/domain necessary to be ready for subsequent work in the content area.
Claim Category: Readiness

Specific Use: Determining the need for remedial college courses within a particular content area
User: Higher Ed.
Test-Based Claim: The student is demonstrating a level of proficiency in the content domain necessary to be successful in college credit-bearing courses.
Claim Category: Predictive

Specific Use: Assign a course grade
User: Teacher
Test-Based Claim: The student has attained the knowledge and skills defined as representing proficiency within the content area/domain.
Claim Category: Attainment

Specific Use: Determine eligibility for graduation
User: State or District
Test-Based Claim: Any of the three claims above may support this use.

This table shows how different uses may be supported by different claims. But it also shows that there is not a one-to-one correspondence between a use and a claim. For example, three states may all desire to use test results to determine eligibility for graduation, yet hold different beliefs regarding the specific claim that must be supported by the test results in order to do so. So why do claims matter?

Why Do Claims Matter?

[Slide diagram: Test Purpose and Use → Claims to support purpose/use → Assessment Design and Validation → Evidence]

So why are claims important? They not only drive the assessment design, but also provide the foundation for validation. They determine what evidence needs to be collected and are a key element of evaluating test quality.

Assessing Quality

The categories of evidence necessary to support the evaluation of an assessment may be consistent across purposes and uses:

- Construct representation
- Reliability/precision
- Comparability
- Fairness
- Performance standards
- Relationship to external variables
- Administration and reporting

However, how that evidence is prioritized, and the criteria defining what is appropriate or good enough, will vary depending on the claim you are making and the nature of the intended use. Similarly, the claim you are making to support the intended use of a test result will influence the evidence necessary to support it.

Different Claim – Same Use

Consider two states, each of which uses student performance on the state summative Grade 11 Math test as a requirement for graduation.

Claim A: The student is demonstrating a level of proficiency in the content domain necessary to be successful in college credit-bearing courses.
Key evidence to support this claim:
- Performance standards: Cut-scores were established in consideration of evidence and expectations representing what it means to be "successful" in non-remedial, college credit-bearing courses.
- Validity: Students meeting the cut-score are ultimately "successful" in credit-bearing courses.

Claim B: The student is Proficient in the assessed content areas.
Key evidence to support this claim:
- Performance standards: The standard-setting process was conducted to establish thresholds between clearly defined knowledge- and skill-based proficiency expectations.
- Validity: The items that Proficient students answer correctly reflect the expectations defined within the Proficient performance level descriptor.

The evidence necessary to support the claim that a student is "ready for post-secondary" is different from the evidence necessary to support the claim that a student has acquired the state's grade-level content standards. Similarly, the evidence necessary to support claims that a student requires remediation or targeted instruction in Grade 4 proportional reasoning skills is different from the evidence necessary to support claims that they are on track to be college ready by the end of high school.
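As a hedged illustration of the validity evidence claim A calls for, the sketch below computes one simple statistic a follow-up study might report: of the students who met the cut-score, what fraction were later "successful" in credit-bearing courses. The data and cut-score are invented; a real validation study would also examine students below the cut and use richer outcome measures.

```python
# Hypothetical (scale_score, later success in a credit-bearing course) pairs
# from an imagined follow-up study; the cut-score is made up for illustration.
CUT = 240

followup = [(255, True), (242, True), (238, False), (260, True), (241, False)]

met_cut = [succeeded for score, succeeded in followup if score >= CUT]
hit_rate = sum(met_cut) / len(met_cut)
print(f"Success rate among students meeting the cut: {hit_rate:.0%}")  # 75%
```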

Different Use – Same Claim

Claim: The student is reading on grade level.

Use 1: Support decisions regarding promotion to the next grade.
Use 2: Identify which reading group a student should be in during classroom instruction.

The appropriateness and quality of evidence necessary to support a given claim depend on how the assessment will be used! How might the way evidence is prioritized and considered differ in these two use cases?

Closing Thoughts

- A test is only "good" to the extent that it provides data and information that support its intended purpose and use.
- To support validation, one must be clear about the claim being made to support a given use.
- To support an evaluation of assessment quality, one must consider the claim being made in combination with the intended use.
- The extent to which a test can serve multiple purposes and uses depends upon the degree to which the claims underlying those uses, and the evidence necessary to support them, are consistent.

Erika Hall, ehall@nciea.org