Gary W. Phillips, American Institutes for Research. United States Department of Education Public Hearings, December 1, 2009, Denver, Colorado.


• The goals of the next generation assessment system envisioned by Race to the Top cannot be reached with our existing testing paradigm.
• Our existing system of state assessments is:
  ◦ uncoordinated
  ◦ non-comparable
  ◦ non-aggregatable
  ◦ non-scalable
  ◦ too expensive
  ◦ too slow

1. Common standards
2. Computer-adaptive tests
3. Better measures of growth

• Common content standards in each state consortium that are internationally competitive and lead to high school graduates who are ready for well-paying careers and postsecondary schooling.
• A common item bank (developed by teachers across the consortium) and common test blueprints; each state would administer comparable tests equated to the consortium's common scale.
• At least 85% of each state test would cover the consortium's common content standards (the other 15% would be state supplements to the common content standards).

• Common internationally benchmarked proficient standards for each grade (comparable across all consortia) would be vertically articulated across grades and on a trajectory that leads to high school career-ready and college-ready proficiency. (The difficulty of the proficient standard would be comparable across all consortia and across all states.)
• Conventional standard-setting methodology would be re-engineered. Current standard setting (e.g., the bookmark procedure) is based primarily on content judgments: state impact data are an afterthought, and national or international impact data are typically not used. In the new design, the common proficient performance standard would be established first through empirical benchmarking. Performance level descriptors (PLDs) would then be written to describe the proficient standard, followed by the PLDs for the other standards.
• Adequate yearly progress (AYP) would be based on proficient performance standards that are comparable across all consortia and across all states, and would therefore yield fair state, district, and school comparisons.
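As an illustration of the empirical-benchmarking idea (set the cut from data first, write descriptors afterward), the sketch below is a hypothetical Python example with invented data: it places a state's proficient cut at the state-scale score matching the percentile implied by an international benchmark's proficiency rate.

```python
def benchmarked_cut(state_scores, intl_proficient_rate):
    """Place the proficient cut at the state-scale score matching the
    percentile implied by an international benchmark: if the benchmark
    says 40% of students reach proficient, the cut sits at the state's
    60th percentile."""
    scores = sorted(state_scores)
    k = round(len(scores) * (1.0 - intl_proficient_rate))
    return scores[min(max(k, 0), len(scores) - 1)]

# Hypothetical state score distribution (100 students, scores 200-299)
# and an assumed 40% international proficiency rate.
state_scores = list(range(200, 300))
cut = benchmarked_cut(state_scores, 0.40)
print(cut)  # 260: exactly 40 of the 100 students score at or above it
```

This is only the percentile-matching core of benchmarking; a real study would link through a common sample or common items rather than assume comparable distributions.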


• The current one-size-fits-all model (the same paper-and-pencil test given to all students) provides poor measurement for large portions of the student population: the tests are too easy for high-achieving students and too hard for low-achieving students, students with disabilities, and English language learners.
• Computer-adaptive tests (CAT) should be encouraged in each consortium. (They already exist in various stages of development in many states, including Delaware, Georgia, Hawaii, Idaho, Maryland, North Carolina, Oregon, South Dakota, Utah, and Virginia.)
• Cost savings, multiple testing opportunities, immediate feedback, and shorter tests.
• Formative and interim assessments (intended to improve instruction) would be developed that are aligned with the summative assessment and the common standards.
• Constructed-response items would, where possible, be administered and scored by computer (validated by teacher hand scoring). Constructed-response items and performance tasks that could not be scored by computer would be scored by teachers.
• Accommodations would be provided, and universal design would be part of the assessment.
• Better reliability and more accurate measurement for high- and low-achieving students, and better measurement for students with disabilities and English language learners.
• Better validity, because the item-selection algorithm can be adaptive as well as standards-based.
  ◦ At the student level the test can meet the blueprint (e.g., if the blueprint calls for 20% algebra, then 20% of the items in the CAT will be algebra).
  ◦ At the classroom level the test can cover the deeper levels of the content standards (e.g., across the classroom it might cover all sub-objectives). This forces teachers to teach all levels of the content standards for which they will be held accountable.
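The blueprint constraint on the item-selection algorithm can be sketched minimally as follows. This is illustrative Python, not any consortium's actual engine; the 2PL item bank, parameters, and quotas are invented. Each step picks the item with maximum Fisher information at the current ability estimate, but only from content areas still under their blueprint quota.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_next_item(theta, bank, administered, counts, quotas):
    """Return the index of the most informative unused item whose
    content area is still below its blueprint quota."""
    best, best_info = None, -1.0
    for idx, item in enumerate(bank):
        if idx in administered or counts[item["area"]] >= quotas[item["area"]]:
            continue
        info = item_information(theta, item["a"], item["b"])
        if info > best_info:
            best, best_info = idx, info
    return best

# Hypothetical bank: 6 algebra and 6 geometry items of varying difficulty.
bank = ([{"area": "algebra", "a": 1.2, "b": b / 2.0} for b in range(-3, 3)] +
        [{"area": "geometry", "a": 1.0, "b": b / 2.0} for b in range(-3, 3)])

# Blueprint: a 5-item test with 20% algebra -> 1 algebra item, 4 geometry.
quotas = {"algebra": 1, "geometry": 4}
counts = {"algebra": 0, "geometry": 0}
administered = []
theta = 0.0  # current ability estimate (held fixed here for simplicity)
for _ in range(5):
    nxt = select_next_item(theta, bank, administered, counts, quotas)
    administered.append(nxt)
    counts[bank[nxt]["area"]] += 1

print(counts)  # the delivered test matches the blueprint proportions
```

A real CAT would also re-estimate theta after each response and add exposure control; the point here is only that adaptivity and blueprint constraints can coexist in the selection rule.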

• With current growth models we frequently see negative growth for the top students and find that our lowest-achieving students are the fastest learners. Both patterns are usually artifacts of the ceiling and floor effects of the current testing paradigm, and would be ameliorated by computer-adaptive testing.
• A common vertical scale would be needed to measure growth across grades (within each consortium), which would facilitate the measurement of grade-to-grade student growth and the application of student growth models.
• Value-added indices and teacher effectiveness measures would be comparable and more accurate.
• A statewide longitudinal data system would be required that uses a unique statewide student identifier, with student data that are transferable, linked to teachers and schools, and maintained throughout K-12.
• More reliable measures of growth. Growth measures are inherently less reliable than status measures; however, because computer-adaptive testing provides more reliable measures of status, it also provides more reliable measures of growth.
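The last point, that more reliable status measures yield more reliable growth measures, follows from how measurement errors combine in a simple gain score. A minimal sketch, with hypothetical standard errors on the theta scale:

```python
import math

def growth_se(se_pre, se_post):
    """Standard error of a simple gain score (post minus pre): the
    errors of the two status measures combine in quadrature."""
    return math.sqrt(se_pre ** 2 + se_post ** 2)

# Hypothetical standard errors: a fixed form measures an off-target
# student less precisely than a CAT that targets items near the
# student's ability.
fixed_form = growth_se(0.45, 0.45)
adaptive = growth_se(0.30, 0.30)

print(round(fixed_form, 3), round(adaptive, 3))
```

The gain score's error is always larger than either status error, which is why growth is inherently less reliable than status; shrinking the status errors (as CAT does) shrinks the growth error proportionally.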

• Implements the vision of Race to the Top with high-quality assessments based on fewer, clearer, higher standards.
• Improves NCLB by correcting two of its fundamental problems (too many content standards and too many performance standards).
• Scalable to a large number of states by taking advantage of innovation and technology.
• Better measurement for a wider range of students in the general population; can be implemented in alternate assessments for the 1% population, and eliminates the need for a modified assessment for the 2% population.
• Feasible, and meets all professional and technical standards of AERA, NCME, and APA.
• Affordable: in the long run it would cost about half as much as paper-and-pencil tests.
• Benefits the federal government (comparable data for states, districts, and schools).
• Benefits the states (cheaper, faster, better assessments with some local flexibility).

The entire assessment system within each consortium would be placed on a vertical scale (e.g., from grade 3 through high school). The vertical scale would reflect the incrementally increasing difficulty of the content standards as students move up the grades, and would be used to improve the accuracy of student growth models and provide better measures of teacher and principal effectiveness. In addition to a vertical scale, the performance standards would be vertically articulated. For example, the proficient standards would be established so that they reflect an orderly progression of increasingly higher expectations as students move up the grades, on an upward trajectory leading to an internationally benchmarked, career-ready and college-ready proficiency standard in high school.
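On a common vertical scale, vertical articulation reduces to a simple monotonicity requirement on the cut scores. A minimal sketch with hypothetical grade 3-8 proficient cuts (the scale and values are invented for illustration):

```python
def vertically_articulated(cuts):
    """True if the proficient cut scores rise strictly with grade
    on the common vertical scale."""
    grades = sorted(cuts)
    return all(cuts[g1] < cuts[g2] for g1, g2 in zip(grades, grades[1:]))

# Hypothetical proficient cuts on a grade 3-8 vertical scale.
cuts = {3: 205, 4: 216, 5: 225, 6: 232, 7: 238, 8: 243}
print(vertically_articulated(cuts))  # True
```

A check like this catches the common failure mode of independently set grade-level cuts: a grade whose "proficient" sits below the previous grade's, breaking the upward trajectory.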

Each consortium of states would need to fund empirical research on how well the high school test predicts college and career success. Recent work by the National Assessment Governing Board (related to validating the 12th-grade NAEP) would inform this process. The predictive validity studies, and an evaluation of the validity of the international benchmarking, should be done by an independent group (e.g., the National Academy of Sciences).
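At its simplest, a predictive-validity study of the kind described correlates high school test scores with a later outcome such as first-year college GPA. A minimal sketch with invented data (real studies would use large matched longitudinal samples and correct for range restriction):

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation between predictor and outcome, the most
    basic predictive-validity coefficient."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: high school test scores and first-year college GPA.
hs_scores = [500, 540, 580, 620, 660]
college_gpa = [2.1, 2.5, 2.8, 3.2, 3.6]
r = pearson_r(hs_scores, college_gpa)
print(round(r, 3))
```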

• Each state consortium should release enough items each year to thoroughly represent the content standards (this would be around items). Over time, more and more items would be released.
• The above design depends on a major item development effort. A substantial pool of items would be needed to:
  ◦ adequately cover the content standards;
  ◦ equate new forms to the common scale with each successive administration;
  ◦ release enough items to help teachers use them for teaching and diagnostic purposes.
• However, since items would be shared across states within a consortium, the cost should be manageable.
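Equating new forms to the common scale through shared anchor items can be sketched with mean-sigma linking, one of several standard IRT linking methods; the difficulty values below are invented for illustration.

```python
import statistics

def mean_sigma_link(base_b, new_b):
    """Mean-sigma linking constants (A, B) that place a new form's
    item difficulties onto the base (common) scale, using anchor
    items calibrated on both scales."""
    A = statistics.stdev(base_b) / statistics.stdev(new_b)
    B = statistics.mean(base_b) - A * statistics.mean(new_b)
    return A, B

# Hypothetical anchor-item difficulties: the same five items as
# calibrated on the consortium base scale and on a new state form
# whose scale is shifted by 0.2.
base_b = [-1.0, -0.4, 0.1, 0.6, 1.2]
new_b = [-0.8, -0.2, 0.3, 0.8, 1.4]

A, B = mean_sigma_link(base_b, new_b)
rescaled = [A * b + B for b in new_b]
print(round(A, 3), round(B, 3))  # A = 1.0, B = -0.2
```

After applying the transformation, the new form's anchor difficulties coincide with their base-scale values, so scores from either form report on the same common scale, which is what makes successive administrations comparable.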