Assessment and Performance Standards How Good is Good Enough? March 4-6, 2008.

Presentation transcript:

William Lorié, Ph.D.
Director, International R&D
CTB/McGraw-Hill
© 2008 The McGraw-Hill Companies, Inc. All rights reserved.

Agenda
- Chapter I: So you want to be a decathlete
- Chapter II: You want me to jump how high?
- Chapter III: Favorable winds, sun in my face: A detour into human performance
- Chapter IV: Philosophies for setting the bar

Chapter I: So you want to be a decathlete

How Good Is Good Enough?

[Figure; surviving labels: UK level “A”, 2000, UK level “A”]

The Standard is Different for the Generalist
- Needed 8000 points for “A” Level Qualification for Decathlon in UK Olympic Team in 2004
- At 800 points per event, I can “get by” with a high jump of 2 meters
- …Or less, if I am relatively strong in other events…
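The compensatory idea on this slide (only the total counts, so a weak event can be offset by strong ones) can be sketched in a few lines of Python. This is a minimal illustration, not an official scoring procedure: the 8000-point total comes from the slide, while the per-event points and the function name `qualifies` are hypothetical.

```python
# Compensatory qualification rule: only the total across events matters.
# The 8000-point standard is taken from the slide; event points are made up.
QUALIFYING_TOTAL = 8000


def qualifies(event_points, required_total=QUALIFYING_TOTAL):
    """Return True if the summed event points reach the required total."""
    return sum(event_points) >= required_total


# Hypothetical athlete: nine events at 820 points, a weak high jump at 650.
athlete = [820] * 9 + [650]

print(sum(athlete))        # total points across the ten events
print(qualifies(athlete))  # the weak event is offset by the other nine
```

An athlete scoring exactly 800 in every event also meets the standard, while one scoring 790 in every event does not, even though no single event is far below the average.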

Chapter II: You want me to jump how high?

Educational Tests Are More Like Decathlons Than High Jumps
- Student learning outcomes are varied and interlinked
- At every level of schooling, especially early on, we want students to do well in a number of broad learning outcomes, not just a few
- Students can be strong overall, weak overall, or strong in some areas and weak in others

Content and Performance Standards
Who’s being Tested? | High Jumper | Decathlete
What’s on the Test? | High Jump Event | Ten Different but Related Events
What do they need to Pass? | Jump 2.3 meters | Get 8000 points (Try to high jump at least 2 meters)

So, How Good is Good Enough?
A matter of judgment
Takes into consideration that
- The test is a sample of tasks that all count toward the final score
- It is not essential to master any one given task
- Tasks are a sample from a broader domain that we care about; not everything that could have been tested is tested.
Traiacontacaioctacathlon anyone?

Chapter III: Favorable winds, sun in my face: A detour into human performance

Olympic High Jump Athlete’s Performance
[Figure; labels: Typical (Average), Recent Worst, Personal Best, World Record]

All Individual Human Performance has variation…
Possible sources of variation (columns: High Jumper or Decathlete; Student Taking a High School Exit Exam)
Systematic:
- Weather conditions
- Altitude
- Indoors or out
- Gender
- Time of year
- Curriculum
- Quality of Instruction
Non-systematic:
- Sharpened focus
- Loss of concentration
- Muscle fatigue / failure
- Lapses in judgment
- Moments of insight
- Mood

…and Error Is a Part of All Measurement
- After you’ve standardized your field conditions, and controlled everything you can think of, you still get variation in individual performance.
- In measurement, that variation is due entirely to non-systematic sources.
- Those sources are all lumped together and called Error.
- Error is a technical concept.

Where Is There Error in Educational Measurement?
- The average score of French 8th graders on TIMSS
- My college entrance exam scores
- Diane Lotfi’s 5th grade standardized achievement test scores
- The grades I gave my 9th year students in physical science when I was a teacher
- Student grade-point averages
- Throughout your entire recorded academic career

Don’t Panic
- In the long run, the Errors average out to zero
- When it matters most, rigorous steps are taken to quantify and minimize Error
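The claim that Errors average out to zero can be illustrated with a small simulation. The sketch below assumes a simple model in which each observed score is a fixed true score plus random error drawn from a zero-mean normal distribution; that model, the numbers, and the function name are assumptions made for illustration and do not come from the slides.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, n_attempts, seed=0):
    """Observed score = true score + zero-mean random error (assumed model)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_attempts)]


true_score = 70.0  # hypothetical true performance level
few = simulate_observed_scores(true_score, error_sd=5.0, n_attempts=3)
many = simulate_observed_scores(true_score, error_sd=5.0, n_attempts=10_000)

# Average error over a handful of attempts versus over many attempts.
print(statistics.mean(few) - true_score)
print(statistics.mean(many) - true_score)
```

With only a few attempts the average error can be noticeably off zero; over many attempts it shrinks toward zero, which is the sense of "in the long run" here.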

The Problem of Error and Performance Standards

Coach, can you give me another chance?

Chapter IV: Philosophies for Setting the Bar

How Do We Set the Bar?
Two Ways: Think of People or Think of Tasks

The People Approach, Roughly…
I know my students well. I can make judgments about whether each has met the bar.
- “Have minimal competency in 4th grade mathematics”
- “Merit a high school leaving certificate”
- “Are prepared for the next unit of instruction in Arabic”
A standardized test is given, and the score that discriminates most highly between the two groups (students judged to have met the bar and students judged not to have met it) is chosen as the standard.
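One way to make the people approach concrete is to search for the test score that best separates the students a teacher judged to have met the bar from those judged not to have. The sketch below interprets "discriminates most highly" as "maximizes agreement between the cut score and the teacher's judgments"; that interpretation, the data, and the function name are assumptions for illustration only.

```python
def contrasting_groups_cut(scores, met_bar):
    """Pick the cut score that agrees most often with the teacher judgments.

    scores:  test scores, one per student
    met_bar: True/False teacher judgment for the same students
    Ties are broken in favor of the lowest candidate cut score.
    """
    best_cut, best_agreement = None, -1
    for cut in sorted(set(scores)):
        # A student "passes" under this cut if their score is at or above it.
        agreement = sum((s >= cut) == m for s, m in zip(scores, met_bar))
        if agreement > best_agreement:
            best_cut, best_agreement = cut, agreement
    return best_cut, best_agreement


# Hypothetical scores and teacher judgments for eight students.
scores = [12, 15, 18, 20, 22, 25, 27, 30]
met_bar = [False, False, False, True, False, True, True, True]

cut, agreement = contrasting_groups_cut(scores, met_bar)
print(cut, agreement)  # chosen cut score and number of students it classifies consistently
```

In this made-up data no cut score agrees with every judgment, which is typical: the two groups usually overlap on the score scale.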

In Practical Terms, Most Standard Setting (or Level Setting) Follows the Task Approach

Some Select Task Approaches
- Angoff and modifications
- Ebel
- Nedelsky
- Jaeger-Mills
- Bookmark
- Body of Work
- Briefing Book
- Item-Descriptor Matching

What They Have in Common
- Consider items, tasks, or more specifically performances on tasks
- Rely on the concept / abstraction of the minimally qualified individual
- Most have been generally accepted in the field
  - Angoff was invented first and is the most widely used
  - Bookmark is the most popular in achievement testing
- All have been praised and criticized

Standard Setting is arguably the most controversial and most consequential of all the areas of educational measurement
Why?
- Variation in results due to method, judges, language of performance standard
- The cut point sometimes has important consequences for students, teachers, schools, entire systems, reform efforts.
That 8000 Can Alter Your Life Plan

Thank you. Questions?

Group Activity: Modified Angoff Standard Setting
You have been convened by the Ministry of a GCC country to establish standards for “Proficiency” in 5th grade mathematics.
Step 1: Discuss the minimally proficient student
- His / her knowledge, skills, and abilities
Step 2: Review / take a test of 20 mathematics items at the 5th grade level
Step 3: We will give you verbal instructions on how to make Angoff judgments on the items
Step 4: You will make one round of judgments and we will provide feedback for you

Instructions for standard setting judges
- What is the probability that a minimally Proficient grade 5 student will get this item correct?
- In a group of 100 minimally Proficient grade 5 students, what percent would you expect to get this item correct?
(Convince yourselves that these are equivalent statements.)
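The judgments asked for above are commonly turned into a cut score by summing each judge's item probabilities (the expected raw score of a minimally Proficient student) and then averaging across judges. The sketch below shows that arithmetic under those assumptions; the ratings are invented, only four items are used instead of the activity's 20, and the function names are hypothetical.

```python
import statistics


def percent_to_probability(percent_correct):
    """A judgment of '60 out of 100 students' is the same as probability 0.60."""
    return percent_correct / 100.0


def judge_cut_scores(ratings):
    """Each judge's cut score is the sum of that judge's item probabilities."""
    return [sum(judge_ratings) for judge_ratings in ratings]


def panel_cut_score(ratings):
    """Panel cut score: the mean of the individual judges' cut scores."""
    return statistics.mean(judge_cut_scores(ratings))


# Hypothetical ratings: one row per judge, one column per item (4 items).
ratings = [
    [0.9, 0.7, 0.5, 0.3],
    [0.8, 0.6, 0.6, 0.4],
    [1.0, 0.7, 0.4, 0.2],
]

print(judge_cut_scores(ratings))  # expected raw score per judge
print(panel_cut_score(ratings))   # recommended cut score on the raw-score scale
```

Because a probability and a "percent of 100 students" differ only by a factor of 100, judges can answer either form of the question and the resulting cut score is the same.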

Types of Feedback
- Table Mean and Dispersion
- Group Mean and Dispersion
- Impact
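The three kinds of feedback can be computed from the judges' cut scores and a set of student scores. In the sketch below, dispersion is taken to be the sample standard deviation and impact is taken to be the percent of students scoring at or above a cut score; both definitions, the data, and the names are assumptions for illustration, since the slide does not specify them.

```python
import statistics


def mean_and_sd(cuts):
    """Mean and sample standard deviation of a set of judges' cut scores."""
    return statistics.mean(cuts), statistics.stdev(cuts)


def impact(student_scores, cut):
    """Percent of students who would score at or above the given cut score."""
    at_or_above = sum(score >= cut for score in student_scores)
    return 100.0 * at_or_above / len(student_scores)


# Hypothetical judge cut scores, grouped by the table each judge sat at.
tables = {"Table 1": [11, 12, 13], "Table 2": [13, 14, 15]}
all_cuts = [cut for cuts in tables.values() for cut in cuts]

# Hypothetical raw scores for ten students on the 20-item test.
student_scores = [5, 8, 10, 12, 13, 13, 15, 17, 18, 20]

for name, cuts in tables.items():
    print(name, mean_and_sd(cuts))         # table-level feedback
group_mean, group_sd = mean_and_sd(all_cuts)
print("Group", group_mean, group_sd)       # whole-panel feedback
print(impact(student_scores, group_mean))  # percent at or above the group mean cut
```

Showing judges the impact of a proposed cut score lets them see its practical consequence before they revise their ratings.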

What would happen in the real thing?
- Multiple Rounds
- Calculation of Level Setting Error
- Review by Sponsoring Agency
- Final Decision
- Implementation
- Possible Future Review
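The slide does not say how Level Setting Error is calculated. One simple possibility, shown here purely as an assumption, is the standard error of the panel's mean cut score: the sample standard deviation of the judges' final-round cut scores divided by the square root of the number of judges. The data and function name are hypothetical.

```python
import math
import statistics


def standard_error_of_cut(judge_cuts):
    """Standard error of the mean cut score across judges (assumed definition)."""
    return statistics.stdev(judge_cuts) / math.sqrt(len(judge_cuts))


# Hypothetical final-round cut scores from six judges.
final_round_cuts = [11, 12, 13, 13, 14, 15]

cut = statistics.mean(final_round_cuts)
se = standard_error_of_cut(final_round_cuts)
print(cut, se)                   # recommended cut and its standard error
print(cut - 2 * se, cut + 2 * se)  # rough band a sponsoring agency might review
```

A band like this gives the sponsoring agency a sense of how much the recommended cut score might move with a different panel of judges.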