1 CSSS Large Scale Assessment Webinar Adaptive Testing in Science Kevin King (WestEd) Roy Beven (NWEA)

Presentation transcript:

1 CSSS Large Scale Assessment Webinar Adaptive Testing in Science Kevin King (WestEd) Roy Beven (NWEA)

2 Agenda
1. Experience with CATs
2. CAT Overview: What and Why
3. Longitudinal Scale
4. Nature of CATs and their Item Banks
5. Using different item types in CATs
6. Discussion: Implications for NGSS?

3 Kevin King:
- HS Biology, Integrated Science, Research Methods Teacher (9 years)
- Science Assessment Specialist for UT State ( )
- Assessment Development Coordinator for UT State ( )
- Senior Assessment Manager for WestEd (2012-present)
Roy Beven:
- HS Physics, Math, Geology, Tech-Ed Teacher (23 years)
- Lead Science Assessment Specialist for WA State ( )
- Senior Science Content Specialist for NWEA (2008-present)

4 Presenters’ Experience with CAT
Utah:
- peer review acceptance of Utah Adaptive Assessment System
Smarter Balanced:
- state co-chair work group member for item development
- program management liaison for multiple work groups
MAP® for Science:
- an interim adaptive test designed to measure growth
- administered last year to over 1.7 million students, mostly in grades 3-8, across the nation

5 CAT Overview: What
Tests are designed to assess the performance of students by locating them on a scale with a high degree of accuracy and precision.
A computer algorithm selects each item according to the student's most recent location on the scale, or according to other criteria.
When the student answers an item correctly, the computer selects an item higher on the scale; when the student answers incorrectly, it selects an item lower on the scale.
The computer keeps selecting items until all the criteria are met.
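
The select-up-on-correct, select-down-on-incorrect loop described on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not the algorithm used by MAP, Smarter Balanced, or Utah: the function name `run_cat`, the item bank layout (item id mapped to a difficulty on an arbitrary scale), and the shrinking step rule are all assumptions made for the example. Operational CATs typically re-estimate the student's location statistically rather than with a fixed step.

```python
def run_cat(bank, answer, max_items=10, start=0.0, step=1.0):
    """Administer up to `max_items` items with a simple up/down rule.

    bank:   dict mapping a hypothetical item id to its difficulty (scale location)
    answer: callable taking an item id and returning True (correct) or False
    Returns the final scale estimate and the administration history.
    """
    theta = start                 # current estimate of the student's location
    remaining = dict(bank)        # items not yet shown to this student
    history = []
    for n in range(1, max_items + 1):
        if not remaining:
            break
        # Pick the unused item whose difficulty is closest to the current estimate.
        item = min(remaining, key=lambda i: abs(remaining[i] - theta))
        difficulty = remaining.pop(item)
        correct = answer(item)
        # Move up after a correct answer, down after an incorrect one.
        # The step shrinks as the test goes on (an assumed, simplistic rule).
        theta += step / n if correct else -step / n
        history.append((item, difficulty, correct, theta))
    return theta, history
```

Calling `run_cat(bank, answer)` with a bank of calibrated items and a function that records the student's responses returns the final estimate together with the list of items shown. The fixed `max_items` stands in for the stopping criteria.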

6 CAT Overview: What
[Diagram: four students are first asked to spell “School”. Students who answer correctly are next asked to spell “Encyclopedia”; students who answer incorrectly are next asked to spell “Red”.]

7 CAT Overview: What (continued)
Possible criteria for item selection (aka, CAT blueprint):
- Student grade range (i.e., blueprinted standards)
- Number of items (i.e., operational and field test)
- Claims being reported (e.g., 3 or 4 disciplines)
- Standard Error of Measurement (SEM)
- Adequate coverage of standards
- Adequate cognitive complexity (DoK)
- Adequate types of items

8 CAT Overview: What (continued)
Constraining a CAT

9 CAT Overview: What (continued)
Sample Test Design with only 3 Criteria
- 3 reporting goals (e.g., life, earth/space, physical)
- 30 operational items (10 per goal)
- SEM
Items to balance the number of items per goal
Items 1-10 to establish preliminary score
Items 26 to 30 to establish the SEM
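
A rough sketch of how the three criteria in this sample design could interact: 3 reporting goals, 10 operational items per goal, and an SEM computed from the items administered. Everything beyond those three numbers is an assumption for illustration: the Rasch model, the item dictionary keys (`"id"`, `"goal"`, `"b"`), and the least-covered-goal-first rule. The ability estimate is held fixed to keep the example short; a live CAT would update it after every response.

```python
import math

GOALS = ("life", "earth_space", "physical")  # reporting goals named on the slide
ITEMS_PER_GOAL = 10                           # 30 operational items in total


def rasch_p(theta, b):
    """Assumed Rasch model: chance that ability theta answers difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def sem(theta, difficulties):
    """Standard error of measurement = 1 / sqrt(total item information)."""
    info = sum(rasch_p(theta, b) * (1.0 - rasch_p(theta, b)) for b in difficulties)
    return math.inf if info == 0 else 1.0 / math.sqrt(info)


def select_items(bank, theta):
    """Pick items so that every goal ends with ITEMS_PER_GOAL items.

    bank: list of dicts with hypothetical keys "id", "goal", "b" (difficulty).
    theta is held fixed here for brevity.
    """
    unused = list(bank)
    counts = {g: 0 for g in GOALS}
    chosen = []
    while True:
        # Goals that still need items and still have unused items in the bank.
        open_goals = [g for g in GOALS
                      if counts[g] < ITEMS_PER_GOAL
                      and any(it["goal"] == g for it in unused)]
        if not open_goals:
            break
        goal = min(open_goals, key=lambda g: counts[g])  # least-covered goal first
        # Within that goal, take the item closest to the current estimate.
        item = min((it for it in unused if it["goal"] == goal),
                   key=lambda it: abs(it["b"] - theta))
        unused.remove(item)
        counts[goal] += 1
        chosen.append(item)
    return chosen, counts
```

After `select_items` returns, `sem(theta, [it["b"] for it in chosen])` gives the standard error implied by the chosen items, which a test design could compare against a target SEM.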

10 CAT Overview: Why (continued)
- Tests present an individually tailored set of questions to each student.
- Tests can quickly identify which skills students have mastered.
- Tests provide accurate scores for all students across the full range of the achievement continuum.
(Source: SBAC, adaptive-testing/)
- CATs have been found to be as accurate as fixed-form tests that are twice as long.
- CATs drawing from large item pools can provide much more information, and more precise information, than fixed-form tests.
- CATs provide immediate feedback to students and teachers.
(Source: ASCD, Adaptive-Assessment.aspx)
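
One way to see why a shorter adaptive test can match a longer fixed form is a worked example under the Rasch model. The model is an assumption here, since the slides do not name one, and the 1.76-logit gap is chosen only to make the arithmetic clean. The example illustrates the claim above; it does not prove it.

```latex
% Rasch probability of a correct response, item information, and SEM
P(\theta, b) = \frac{1}{1 + e^{-(\theta - b)}}, \qquad
I(\theta, b) = P(\theta, b)\,\bigl(1 - P(\theta, b)\bigr), \qquad
\mathrm{SEM}(\theta) = \frac{1}{\sqrt{\sum_i I(\theta, b_i)}}

% Item targeted at the student (b = \theta):
P = 0.5, \qquad I = 0.5 \times 0.5 = 0.25

% Item 1.76 logits away from the student (|\theta - b| = 1.76):
P \approx 0.853, \qquad I \approx 0.853 \times 0.147 \approx 0.125

% So n well-targeted items give the same total information, and the same SEM,
% as 2n items that are each 1.76 logits off target:
n \times 0.25 = 2n \times 0.125
```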

11 Longitudinal Scale
Many existing LSAs develop a new scale for each grade-level test each year, then equate these new scales back to the scale established when the tests were first administered.
CATs establish one scale. Items are calibrated onto this one scale for the life of the test.
The scale could be re-established, but this would affect all items in the item bank.

12 Nature of CATs and their Item Banks
CAT item banks:
- Are larger than static test item banks (typically 4-10 times larger).
- Last longer than static test item banks, as individual item exposure is limited.
- Need to cover the “range of the algorithm” criteria (e.g., DoK, standards) at a range of item difficulty.
- Have a range of difficulty that is not fully known until after items are field tested.
- Are a challenge to build at the onset of a new test.
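
The requirement that a bank cover the algorithm's criteria at a range of difficulty can be checked with a simple cross-tabulation. The sketch below is illustrative only: the function name `coverage_gaps`, the item keys (`"goal"`, `"dok"`, `"b"`), and the difficulty bin edges are all hypothetical, and a real bank audit would involve more criteria.

```python
from collections import Counter
from itertools import product


def coverage_gaps(bank, goals, dok_levels, bin_edges, min_items=1):
    """Return the (goal, DoK, difficulty-bin) cells holding fewer than `min_items` items.

    bank:      list of dicts with hypothetical keys "goal", "dok", "b" (difficulty)
    bin_edges: ascending cut points; n edges define n + 1 difficulty bins
    """
    def bin_of(b):
        # Index of the first edge the difficulty falls below; last bin otherwise.
        for k, edge in enumerate(bin_edges):
            if b < edge:
                return k
        return len(bin_edges)

    counts = Counter((it["goal"], it["dok"], bin_of(it["b"])) for it in bank)
    n_bins = len(bin_edges) + 1
    return [cell for cell in product(goals, dok_levels, range(n_bins))
            if counts[cell] < min_items]
```

Because difficulties are only known after field testing, a check like this would be rerun each time newly calibrated items enter the bank.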

13 Using different item types in CATs
1. Multiple-choice, dichotomously scored items
2. Technology Enhanced Items (TEIs)
3. Polytomously scored items
4. Constructed response items
5. Common Stimulus Item Sets (CSIS)
6. Simulations with scoring by path (PhET, SimSci, NAEP)
7. Others?

14 Discussion: Implications for LSA of NGSS?
- What part of a state’s NGSS assessment system could (might) be a CAT?
- Can (should) a single CAT measure all grade ranges? K-2, 3-5, 6-8, 9-12
- Can (should) a CAT report on the 3 dimensions of the NGSS (DCIs, SEPs, and CCs)?
- Can a CAT report on the 4 disciplines of the NGSS?
- Can a CAT report on an adequate range of NGSS PEs?