The New Adaptive Version of the Basic English Skills Test Oral Interview: The BEST Plus. Dorry M. Kenyon. Funded by OVAE Contract: ED-00-CO-0130.


Presentation transcript:

1 The New Adaptive Version of the Basic English Skills Test Oral Interview Dorry M. Kenyon Funded by OVAE Contract: ED-00-CO-0130 The BEST Plus

2 Overview 1. Why the BEST Plus? 2. What does the BEST Plus look like? 3. What is its research base? 4. How can the BEST Plus be used?

3 Overview Why the BEST Plus? What does the BEST Plus look like? What is its research base? How can the BEST Plus be used?

4 The original BEST Oral Interview Developed early 1980s Assessed basic functional oral English language skills for adult immigrants and refugees Designed for program use Began to be widely used for accountability purposes

 1. Where is he?  2. In, where did you buy your food?  3. Is shopping in and the same? How is it different/the same?

6 The BEST Plus A performance-based assessment (individually administered face-to-face oral interview) Assesses functional oral language skills (interpersonal communication) of adult ESL learners using everyday language Designed with current assessment needs in mind

7 Goals in developing the BEST Plus Respond to adult ESL program needs for assessment and accountability – Produce a test that is short and practical – Assess learner language for a variety of purposes and stakeholders – Increase accuracy in measuring oral proficiency – Provide “multiple forms” for pre- and post-testing

8 Overview Why the BEST Plus? What does the BEST Plus look like? What is its research base? How can the BEST Plus be used?

9 BEST Plus components (computer-based version) Test items appear on the computer screen (instead of in a test booklet) If an item requires a visual, examinees view the visual on the computer screen (instead of a picture cue booklet) Test administrators enter scores directly into the computer (instead of on a score sheet)


11 3. What does the computer-assisted BEST Plus look like?

12 3. What does the computer-assisted BEST Plus look like?

13 Sample computer screen

14 BEST Plus components (print-based version) Three forms Within each form, locator test + three level tests – SPL 1-4 – SPL 4-6 – SPL 6-10 Materials – Picture booklet – Test booklet (scripts and score sheet combined)


16 Scoring on 3 components of proficiency Listening Comprehension = How well did the examinee understand the setup and question? Language Complexity = How did the examinee organize and elaborate the response? Communication = How clearly did the examinee communicate meaning?

17 Ability estimation After each question, the program estimates the examinee’s ability based on scores awarded on the current and all previous questions. With each estimation, the accuracy of the measurement increases. Goal: To ‘level off’ in estimation with acceptable level of accuracy.
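The idea on this slide can be illustrated with a minimal sketch. It is not the BEST Plus estimation procedure, which the slides do not specify. It assumes a plain running mean of item scores, with the standard error of that mean standing in for "accuracy". The function name and the scores are illustrative assumptions.

```python
import math

def running_estimates(scores):
    """Yield (n, estimate, standard_error) after each scored question.

    Illustrative only: the estimate is the mean of all scores so far and
    the standard error is that of the mean. The actual BEST Plus model is
    not described in the slides.
    """
    total = 0.0
    total_sq = 0.0
    for n, score in enumerate(scores, start=1):
        total += score
        total_sq += score * score
        mean = total / n
        if n > 1:
            # Sample variance from running sums, clamped against rounding error.
            variance = max((total_sq - n * mean * mean) / (n - 1), 0.0)
            se = math.sqrt(variance / n)
        else:
            se = float("inf")  # a single score carries no spread information
        yield n, mean, se

# Hypothetical item scores for one examinee (made up for illustration).
history = list(running_estimates([3, 4, 3, 4, 3, 4, 3, 4]))
```

With these made-up scores, the standard error after the eighth question is smaller than after the second, which is the "levelling off" the slide describes.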

18 Path through the computer-adaptive BEST Plus Following a fixed “warm-up,” examinees are asked questions drawn from several thematically-based “folders.” After hearing each response, the test administrator enters a score for each component. After each set of scores is entered, the computer updates its estimate of the examinee’s ability, and chooses folders and questions as appropriate. The test ends when one of three conditions is met. Users can instantly receive full score report.
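The control flow on this slide might be sketched as follows. Everything specific here is an assumption made for illustration: the toy estimate (a mean of all component scores), the rule of picking the unused question whose difficulty is closest to the estimate, the question ids, and the stop condition. The slide says the test ends when one of three conditions is met but does not name them, so the sketch accepts any list of conditions.

```python
COMPONENTS = ("listening", "complexity", "communication")

def run_adaptive(warm_up, pool, enter_scores, stop_conditions):
    """Return the (question_id, scores) pairs in the order administered.

    warm_up: non-empty list of question ids asked first, in fixed order.
    pool: dict of question_id -> hypothetical difficulty on the score scale.
    enter_scores: callable(question_id) -> dict with one score per component,
        standing in for the test administrator keying in scores.
    stop_conditions: callables(log, estimate) -> bool; the test ends as soon
        as any of them returns True or the pool is exhausted.
    """
    if not warm_up:
        raise ValueError("the warm-up must contain at least one question")
    log = []
    estimate = None
    remaining = dict(pool)

    def record(qid):
        nonlocal estimate
        log.append((qid, enter_scores(qid)))
        # Toy update: mean of every component score entered so far.
        values = [scores[c] for _, scores in log for c in COMPONENTS]
        estimate = sum(values) / len(values)

    for qid in warm_up:
        record(qid)
    while remaining and not any(cond(log, estimate) for cond in stop_conditions):
        # Choose the unused question closest in difficulty to the estimate.
        qid = min(remaining, key=lambda q: abs(remaining[q] - estimate))
        del remaining[qid]
        record(qid)
    return log

# Hypothetical pool, scores, and stop condition.
pool = {"easy": 1.0, "medium": 2.0, "hard": 3.0}
same_scores = lambda qid: {"listening": 2, "complexity": 2, "communication": 2}
log = run_adaptive(["w1"], pool, same_scores, [lambda entries, est: len(entries) >= 3])
order = [qid for qid, _ in log]
```

Passing the stop conditions in as callables keeps the loop independent of whatever the three real conditions are.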

19 Path through the print-based BEST Plus Administer and score Locator questions (the fixed “warm-up” items + 2 high end discriminators) Total score on Locator and choose level test based on chart Administer level test Total raw score and find approximate SPL range Enter raw scores into computer BEST Plus Score Management software to obtain full score report
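The "total the Locator, then choose a level test from the chart" step could look like the sketch below. The three level tests are the ones listed for the print-based version. The cutoff values are hypothetical, since the chart itself is not reproduced in the slides, and the function name is an assumption.

```python
from bisect import bisect_right

# Level tests named for the print-based version.
LEVEL_TESTS = ("SPL 1-4", "SPL 4-6", "SPL 6-10")

def choose_level_test(locator_scores, cutoffs):
    """Total the Locator item scores and pick a level test.

    cutoffs: two ascending Locator totals at which the next level test
    begins. The real chart values are not given in the slides, so the
    caller must supply them.
    """
    if len(cutoffs) != len(LEVEL_TESTS) - 1:
        raise ValueError("need exactly two cutoffs for three level tests")
    total = sum(locator_scores)
    return total, LEVEL_TESTS[bisect_right(cutoffs, total)]

# Hypothetical Locator scores and cutoffs, for illustration only.
result = choose_level_test([2, 1, 3], cutoffs=(5, 10))
```

Here a Locator total of 6 falls between the two assumed cutoffs, so the middle level test is selected.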

20 Overview Why the BEST Plus? What does the BEST Plus look like? What is its research base? How can the BEST Plus be used?

21 Rigorous development procedures Feasibility study ( ) Initial development ( ) Pilot, small scale field test, initial reliability study (2001) Revisions ( ) Pilot, full scale field test, reliability study, standard setting study (2002) Finalization of training materials, ancillary materials, further refinements (2003)

22 Full involvement of stakeholders OVAE oversight Technical Working Group (TWG), comprised of researchers, state directors, and local program directors and practitioners Item writers, comprised of experienced adult ESL teaching professionals Instructors and students in the field

23 Example: Full scale field test participants 9 states (DC, DE, FL, IL, MA, MD, OR, PA, VA) 23 programs 41 administrators 2420 examinees

24 Example: Reliability study adult ESL students Two testing rooms (A, B) Administrator (project staff) Observer/Co-Scorer (project staff) Observer/Co-Scorer (novice scorer) Each student was tested, then immediately retested in second room

25 Average interrater agreement Within administration (same room) Total Score: Room A (3 raters), Room B (3 raters)

26 Test/re-test reliability Between Rooms Final Ability Estimate

27 Example: Some initial validity evidence Analyses of ancillary data collected from program records during the field test, including test scores less than six months old Standard setting study

28 Correlations with program placement Columns: Range of Correlation, Number of Programs, Percentage. Rows: .80 or above, 7, 30.4%; .70 to %; .60 to %; .50 to %; Below %; TOTALS, 23, 100%

29 Summary: Program placement correlations 69.5% of the correlations were .70 or higher
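As a quick arithmetic check, the summary figure can be tied back to the program counts. This sketch assumes the percentages are taken over the 23 programs in the TOTALS row; the counts it derives are inferences from the surviving numbers, not values read from the slides.

```python
TOTAL_PROGRAMS = 23         # TOTALS row on the correlations slide
AT_OR_ABOVE_80 = 7          # programs with correlations of .80 or above
PCT_AT_OR_ABOVE_70 = 69.5   # percentage reported on the summary slide

def pct(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)

# Whole number of programs closest to 69.5% of 23 (an inference).
programs_at_or_above_70 = round(PCT_AT_OR_ABOVE_70 / 100 * TOTAL_PROGRAMS)
# Programs at .70 or higher but below .80 (also an inference).
programs_70_to_80 = programs_at_or_above_70 - AT_OR_ABOVE_80
```

Under that assumption, 7 of 23 programs reproduces the reported 30.4%, and 69.5% corresponds to about 16 programs, leaving roughly 9 with correlations from .70 up to .80.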

30 Example: Standard setting study 11 judges 30 student performances Performances (about 6 min each) arranged from lowest to highest Judgment made: “Which SPL is best characterized by this performance?” Judges were able to complete this task relating the SPL descriptors to the observed performances

31 Overview Why the BEST Plus? What does the BEST Plus look like? What is its research base? How can the BEST Plus be used?

32 The BEST Plus Score Report Information includes: – BEST Plus Scale Score – SPL level – NRS level – Diagnostic information

33 Uses of the BEST Plus Accountability – National Reporting System (NRS), as scores on the BEST Plus relate to the 6 NRS levels for Speaking and Listening – Program Evaluation

34 Standard setting outcome (SPLs) Columns: SPL, Scale Score Range. Rows: 0, Below; Above 795

35 Standard setting outcome (NRS) Columns: NRS Level, Related SPL, BEST Plus Scale Scores. Rows: Beginning ESL Literacy, 0-1, Below 401; Beginning ESL; Low Intermediate ESL; High Intermediate ESL; Low Advanced ESL; High Advanced ESL, 7 or more, Above 540

36 Uses of the BEST Plus Within Programs – Placement – Progress – Diagnosis – Screening

37 Diagnostic score report information

38 Example (diagnostic information) Relative to other SPL 5s, current examinee is: Low in listening High in complexity Average in communication
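One way to produce labels like these is to compare each component score with the scores of other examinees at the same SPL. The sketch below is an assumption about how such a comparison could work, not the BEST Plus rule: it treats anything within one standard deviation of the peer mean as "Average". The function name, the band, and the peer scores are all hypothetical.

```python
from statistics import mean, stdev

def relative_standing(score, peer_scores, band=1.0):
    """Label a score Low, Average, or High against peers at the same SPL.

    band is the number of peer standard deviations counted as "Average".
    This cutoff is an assumption made for illustration.
    """
    centre = mean(peer_scores)
    spread = stdev(peer_scores)
    if score < centre - band * spread:
        return "Low"
    if score > centre + band * spread:
        return "High"
    return "Average"

# Hypothetical component scores for other examinees at the same SPL.
peers = [10, 12, 14, 16, 18]
labels = {
    "listening": relative_standing(9, peers),
    "complexity": relative_standing(18, peers),
    "communication": relative_standing(14, peers),
}
```

With these made-up numbers the examinee comes out low in listening, high in complexity, and average in communication, matching the pattern in the slide's example.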

39 Questions and discussion

40 Thank you