English Language Development Assessment (ELDA)


English Language Development Assessment (ELDA): Vertically Moderated Standard Setting
You are the Articulation Committee. Your task over the next two days is to take the twelve sets of recommendations made by the four committees meeting the last four days and turn them into a set of coordinated recommendations that fit together across grades and tests. This process is sometimes referred to as Vertically Moderated Standard Setting. Its goal is to derive a final set of cut score recommendations that looks reasonable when viewed across the span of grades.

Purpose: Order standards across grade levels; smooth bumps; create an ordered sequence of expectations.
You have agreed to stay for an extra two days to review all the recommendations from the four committees that met from Monday morning through this morning, to smooth out any bumps there might be from grades 3-5 to grades 9-12. We will examine each cut score in the context of all other cut scores. More importantly, we will examine each cut score in the context of the contents of the tests and the PLDs. When we leave here on Saturday, we will have developed a set of recommendations that establish an ordered sequence of expectations of students at the five performance levels across the three grade spans. You should be comfortable with these recommendations and I want to make sure they make sense to you in terms of percentages of students in various levels across the grade spans and in terms of the definitions of those levels.

Goals: Make sense of ranges of cut scores; adjust cut scores to increase face validity; consider a “global” standard.
Our goals are:
To make sense of ranges of cut scores. Each committee (except Speaking, which examined all three grade spans) focused on a single grade span. The members of the 3-5 committee did not know what the members of the 6-8 committee were doing, and neither knew what the members of the 9-12 committee were doing. You have access to all sets of recommendations and all data.
To adjust cut scores to increase face validity. You may find that 3-5 and 9-12, for example, set cut scores that make 6-8 seem out of line. You will have an opportunity to recommend an adjustment in one or more cut scores for grades 6-8. Or you may find that there is a reasonable expectation moving from 3-5 to 6-8 but that there is a reversal at 9-12. There are all sorts of possibilities. You will examine all the recommendations and adjust any that seem out of line in relation to the standards at other grade spans, in terms of percentages of students in the various levels, and in terms of the overall sense of what is expected of a student at a given level in a given grade span.
To consider a “global” standard. The ELDA includes a score called Comprehension, which is a composite of Listening and Reading. We did not set Comprehension cut scores during the first part of the week. This group will develop a set of rules for that. Similarly, you will create a set of rules to derive a set of cut scores for an overall Composite score that includes all four components of the tests.

Things to Consider: Distributions of students by grade and level; scale scores; match to classroom teacher judgments.
Here’s what we will consider as we examine the various cut score recommendations:
Distributions of students by grade and level - You will see tables showing how many students would be classified at Levels 1, 2, 3, 4, and 5, given the current set of recommendations for all three grade spans for all four tests. Do these percentages make sense, both intrinsically and in relationship to other sets of percentages?
Scale scores - The tests for the various grade spans have been developed with overlapping items to permit a single score scale that runs from grade 3 to grade 12. Looking at the cut scores moving up the grades, is there a consistent forward motion to the cut scores? Is there an increasing expectation for students at higher grade levels?
Match to classroom teacher judgments - The teachers who administered the tests last spring also rated their students, from 1 to 5, using a rating scale similar to the PLDs we have used this week. Do the current cut scores make sense when we compare our percentages of students in the five levels with the percentages indicated by classroom teachers?
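As an illustration of the first consideration, here is a minimal Python sketch of how four cut scores turn a list of scale scores into percentages of students at Levels 1 through 5. The cut scores and student scores are hypothetical values invented for the example, not ELDA data, and the rule that a score exactly at a cut falls in the higher level is an assumption.

```python
from bisect import bisect_right
from collections import Counter


def classify(scale_score, cuts):
    """Return a performance level (1-5) for one scale score.

    `cuts` holds four ascending cut scores. A score equal to a cut is
    placed in the higher level (an assumption made for this sketch).
    """
    return bisect_right(cuts, scale_score) + 1


def level_percentages(scores, cuts):
    """Return the percentage of students classified at each level 1-5."""
    counts = Counter(classify(score, cuts) for score in scores)
    total = len(scores)
    return {level: 100.0 * counts.get(level, 0) / total for level in range(1, 6)}


# Hypothetical cut scores and student scale scores, for illustration only.
cuts = [450, 550, 650, 750]
scores = [400, 450, 500, 600, 640, 700, 760, 800]

percentages = level_percentages(scores, cuts)
print(percentages)
```

The same function could be run once per grade span and test, and its output compared with the percentages implied by the classroom teacher ratings.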

Activities: Review results of 2 rounds of standard setting; view tables and graphs; consider smoothing methods; recommend final cuts; consider “global” cut scores.
We have divided the work of this committee into five activities:
Review results of 2 rounds of standard setting - You will take one test at a time (e.g., Listening) and see how the recommended cut scores for each performance level move across the grade spans.
View tables and graphs - You will have access to all the data the individual committees reviewed as they made their recommendations. You will consider their recommendations in relation to these data to make sure they make sense to you.
Consider smoothing methods - We have many ways to smooth the recommendations, or change one or more cut scores. We can look just at the scale scores corresponding to each cut and make sure they move up steadily across grade spans. We can look at percentages of students in each of the five performance levels by grade span. We can look at the contents of the tests and PLDs and modify cut scores to bring them more into line with our collective understanding of those definitions. Or we can use a combination of these approaches.
Recommend final cuts - We will affirm or modify each of the four cuts for each of the 12 tests, 48 cut scores in all. We will do this by show of hands, cut score by cut score. A simple majority will carry each final cut score recommendation.
Consider global cut scores - As I mentioned earlier, you will create a set of rules to define cut scores for Comprehension and a total Composite. We will provide forms and other aids to make your task as simple as possible.
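The arithmetic behind "48 cut scores in all" and the simple-majority rule can be sketched in Python as follows. The function names are hypothetical, and the treatment of a majority as strictly more than half of the members voting is an assumption; the presentation does not say how ties or abstentions are handled.

```python
from itertools import product

TESTS = ["Listening", "Reading", "Speaking", "Writing"]
GRADE_SPANS = ["3-5", "6-8", "9-12"]
CUTS_PER_TEST = 4  # four cuts separate five performance levels


def all_cut_decisions():
    """List every (test, grade span, cut number) the committee votes on."""
    return [
        (test, span, cut)
        for test, span in product(TESTS, GRADE_SPANS)
        for cut in range(1, CUTS_PER_TEST + 1)
    ]


def carries(hands_raised, members_voting):
    """Return True when strictly more than half of the voters raise a hand.

    Assumption: a tie does not carry the recommendation.
    """
    if members_voting <= 0:
        raise ValueError("members_voting must be positive")
    return 2 * hands_raised > members_voting


decisions = all_cut_decisions()
print(len(decisions))   # 4 tests x 3 grade spans x 4 cuts
print(carries(6, 11))   # more than half of 11 voters
print(carries(5, 10))   # an exact tie
```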

Cut Scores Across Grade Spans: Listening, Reading, Speaking, Writing
Let’s look at the cut scores in Rasch scale score terms across the three grade spans (3-5, 6-8, and 9-12) for all four tests. The next four charts show how these cut scores line up.

Here we see the cut scores for Listening. All four lines move up from left to right in a fairly orderly fashion. This means that the cut scores for a given level (Beginner, Intermediate, etc.) go up from grade span to grade span, indicating that we expect more of an Intermediate student at grades 9-12, for example, than we do from an Intermediate student at grades 3-5 or 6-8. Part of this difference in expectations is due to normal maturation, and part of it is due to differences in the complexity of tasks facing high school students relative to those facing elementary or middle school students.

Here we see a similar pattern for Reading. The cut scores move up fairly steadily from one grade span to the next and certainly from one performance level to the next.

Similarly, for Speaking, we see relatively steady advancement in cut scores from one grade span to the next. We do see, however, a flat line for Intermediate, indicating that the cut score for grades 3-5 is about the same as for grades 6-8 and grades 9-12. We will want to talk about why this line is so flat. Two of the other lines, for Advanced and FEP, change rates. Advanced moves up fairly quickly from 3-5 to 6-8 and then flattens out, showing that we don’t expect much more from high school students at the Advanced level than we do of middle school students. For FEP, however, we rapidly accelerate our expectations of high school students, relative to middle school students, as shown by the sharp incline in the FEP line from 6-8 to 9-12, relative to its slope from 3-5 to 6-8. We will talk about this when we get to the Speaking test.

Finally, there is the Writing test. Notice that all four of these lines dip down from 3-5 to 6-8 but then go back up again from 6-8 to 9-12. The Rasch scale cut scores for grades 3-5 are consistently higher, at every performance level, than those for grades 6-8, and frequently higher than those for 9-12. Only Advanced shows a higher expectation for high school students than for elementary students. We think there may be a scaling issue at work here, but we will examine that and other issues very carefully when we get to the Writing test.
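A dip of the kind described for Writing can be flagged mechanically by checking whether each level's cut score drops from one grade span to the next. The following Python sketch shows one way to do that; the function name and the cut score values are hypothetical and chosen only to show one line that dips and one that rises steadily. A flat line (equal cut scores) is not flagged here, which is a choice made for this sketch.

```python
GRADE_SPANS = ["3-5", "6-8", "9-12"]


def find_reversals(cuts_by_level, spans=GRADE_SPANS):
    """Return (level, lower span, higher span) wherever a cut score drops.

    `cuts_by_level` maps a level name to its cut scores, listed in the
    same order as `spans`. Equal cut scores (a flat line) are not flagged.
    """
    reversals = []
    for level, cuts in cuts_by_level.items():
        for i in range(len(cuts) - 1):
            if cuts[i + 1] < cuts[i]:
                reversals.append((level, spans[i], spans[i + 1]))
    return reversals


# Hypothetical Rasch scale cut scores, for illustration only.
example_cuts = {
    "Intermediate": [520, 500, 540],  # dips from 3-5 to 6-8, then recovers
    "Advanced": [600, 610, 650],      # rises steadily
}

reversals = find_reversals(example_cuts)
print(reversals)
```

A check like this only identifies where the ordering breaks; deciding whether to adjust a cut still rests on the test contents, the PLDs, and the percentages of students at each level.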